Can AI help product management? Today: failing at rote, boring research

Since OpenAI launched I’ve tried to use LLM tools to see if they can help with my work in Product — we have a strange and often-impossible job, which Cagan describes as requiring us to be expert on:

  1. The product (what it does, how, what’s it good for…)
  2. The competition (what do their products do, what are they building, how you compare…)
  3. The industry in a broader sense
  4. The data (all the user research, all the instrumentation, the dashboards and progress against OKRs)
  5. The technology (what’s happening in tech, especially as it relates to your product)

We’re also supposed to do whatever else is required to ensure the product’s a success, and often that means we jump in to do QA, for instance, or research what products might be able to fill a particular gap for a build/buy decision.

My question has been “where can AI tools actually help with PM?” and I’ve been kicking the tires on basically anything that comes my way. I’m going to start sharing these experiences, with examples. I’m also going to try to start each one with the simplest possible initial prompt, so no prompt engineering etc.

I haven’t had much success with previous attempts: I’ve generally agreed with the description of LLMs as “an extremely eager-to-please intern armed with Wikipedia and an internet connection who is also on mushrooms.” I’ve frequently realized that plausible-sounding summaries are wildly out-of-date, for instance, or had API calls that don’t exist recommended as a solution.

To the task at hand: I had a list of 400 bank routing numbers and thought “oh, here’s a great example research task: for a set of ID numbers, find a piece of publicly-available information for each one.” So I asked this in Perplexity, using the GPT-4-Omni model: “I need to know the name of the bank associated with each of these routing numbers.”

I’ll spoil it: for my own time and mental health I should have just used Excel and VLOOKUP or something. This was so frustrating, and for a task that I feel should be a softball set on a tee, if my LinkedIn feed is to be believed. Here’s roughly how the conversation went:

  1. It’s such an extensive list. Here’s 20 results. Do you need the rest?
  2. Me: Yes
  3. Here’s 20 more. Do you need more?
  4. Me: I need the whole list
  5. Here’s the list!
  6. Me: that’s not the whole list
  7. That list of routing numbers is so long. I can do it in chunks though. Here’s a couple more.
  8. Me: please continue
  9. “Providing the bank name for every routing number in your list is a massive undertaking due to the extensive size of the list. ”
  10. Me: how is 400 rows a massive undertaking? (it feels strange, still, to be frustrated at an LLM as if it’s a thing that is being uncooperative, and to express that frustration)
  11. You’re right, here’s some more.
  12. Me: I’ve spot checked a couple of these and they’re wrong (side note: bank routing numbers do change names as smaller banks are bought by larger ones, this was not that — this was “you have 111015159 as being Sandy Springs Bank and it’s actually Wells Fargo, Sandy Springs Bank has a routing number of 055001096”)
  13. “It seems that the routing numbers list provided in the code snippet was incomplete or improperly formatted, and the execution resulted in a syntax error.” (which… I don’t think that’s true)… I’ll rebuild this and here’s the first chunk of 10 —

At which point I checked a couple and they were plausible but wrong again, with another very clear “this is listed as being this small bank, but it’s not, it’s someone else, the small bank’s number is 1234….”
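For what it's worth, the spot check I was doing by hand can be sketched in a few lines of Python. This is a minimal sketch, not what I actually ran: the `spot_check` helper is hypothetical, the reference table holds only the two pairs mentioned above, and the second model answer is made up for illustration.

```python
import random


def spot_check(model_answers, reference, sample_size=2, seed=0):
    """Compare a random sample of model answers against a trusted reference.

    Both arguments map routing number (as a string) to bank name.
    Returns a list of (routing_number, model_name, reference_name) mismatches.
    """
    rng = random.Random(seed)
    # Only check numbers that exist in both tables.
    keys = sorted(k for k in model_answers if k in reference)
    sample = rng.sample(keys, min(sample_size, len(keys)))
    mismatches = []
    for rn in sample:
        got = model_answers[rn].strip().lower()
        expected = reference[rn].strip().lower()
        if got != expected:
            mismatches.append((rn, model_answers[rn], reference[rn]))
    return mismatches


# Known-good pairs taken from the example above.
reference = {
    "111015159": "Wells Fargo",
    "055001096": "Sandy Springs Bank",
}

# What the model returned; the second entry is a made-up illustration.
model_answers = {
    "111015159": "Sandy Springs Bank",
    "055001096": "Sandy Springs Bank",
}

mismatches = spot_check(model_answers, reference, sample_size=2)
for rn, got, expected in mismatches:
    print(f"{rn}: model said {got!r}, reference says {expected!r}")
```

Even a tiny sample like this is enough to decide whether the rest of the output can be trusted, which in my case it could not.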

In double-checking even outside the known-good reference I already had, I figured I’d find some reason why the results were so bad: spam sites, say, like those phone number lookup farms where each result has “other routing numbers to check out!” link blocks. But I didn’t see it. I’d look up a routing number, see it showed as a different bank, look up the name of the bank the model said it was, and find a different routing number.

I don’t know. But it took a while, it was frustrating and didn’t help at all.
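For contrast, here is roughly what the boring VLOOKUP-style approach looks like in Python. It's a sketch under assumptions: the inline CSV stands in for whatever trusted routing number reference you have on hand, the column names `routing_number` and `bank_name` are hypothetical, and `load_reference` and `lookup_banks` are illustrative helpers rather than anything from a library.

```python
import csv
import io

# Stand-in for a trusted reference file; the column names are assumptions.
REFERENCE_CSV = """routing_number,bank_name
111015159,Wells Fargo
055001096,Sandy Springs Bank
"""


def load_reference(handle):
    """Build a routing number -> bank name mapping from a CSV file handle."""
    # Keep routing numbers as strings so leading zeros survive.
    return {
        row["routing_number"].strip(): row["bank_name"].strip()
        for row in csv.DictReader(handle)
    }


def lookup_banks(routing_numbers, reference):
    """Return (routing_number, bank_name) pairs, flagging unknown numbers."""
    return [(rn, reference.get(rn, "NOT FOUND")) for rn in routing_numbers]


reference = load_reference(io.StringIO(REFERENCE_CSV))
results = lookup_banks(["055001096", "111015159", "000000000"], reference)
for rn, name in results:
    print(f"{rn}\t{name}")
```

The one design choice worth calling out is treating routing numbers as text: read them as integers (or let a spreadsheet guess the type) and 055001096 quietly loses its leading zero.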

I then threw the same question and list of numbers into ChatGPT directly (the free version) and got similarly bad results. For comparison —

As a bonus, after chunking my 400 numbers out into incorrect answers, ChatGPT helpfully offered to let me export the whole set, which had its own set of problems:

This then goes on for a while (five iterations!) ending with

It then bombed and said “I can’t do more advanced data analysis right now” (which sure, it’s free tier).

The answer about simulated data made me wonder if that’s actually what was happening with the rest of the data, despite what Perplexity/ChatGPT-Omni was reporting and the citations it claimed to have looked at: maybe it was just answering “hey what are plausible-sounding bank names?”

It also made me think about one of the stories that kept showing up for me that day: another company head insisting everyone at their company adopt AI everywhere it can be used, no new headcount until you’ve tried AI for every task, all of that.

How demoralizing would it be to have someone yelling at you to complete something like this, where you can show that the results are bad, it’s unclear how to improve or what you can make from this thing, knowing that if you don’t have an “adopted AI for this workflow and got 50% improvement” bullet point on your weekly status you’re going to be interrogated and probably, eventually, forced out?

How many people out there faced with this kind of situation are deciding the path of self-preservation is to implement workflows they know aren’t quite right, hoping to blame the model or find a way to go patch it up later? What happens when everyone at the company is building processes this way?

Overall, then, the result of this “can I take this simple rote research task and apply AI” experiment was bad data that took a lot to coax out, and it put me into that kind of mood, which nobody wants.

As always, open to suggestions on how to structure the work better, if there are better tools or approaches to try, all that good stuff, and I’m happy to do some follow-ups.
