Category Archives: Product Management

Can AI help product management summarize customer feedback?

Summarizing customer feedback is one of the most common “you’ve got to try this” AI-for-product-managers use cases I’ve seen, so I ran an experiment. The short version: it’s a potentially good tool, but you need to keep reading the feedback yourself.

Reading feedback builds character, and I’d argue it’s a crucial part of any good product manager’s quest to better understand their customers. You’re looking for sentiment, yes, and also patterns of complaints, but the truly great finds are in strange outliers when you discover someone’s using your product in a new way, or there’s one complaint that makes the hair on your neck stand up.

I was concerned going in: LLMs are good at generating sentences word by most probable word, which means they tend toward the consensus, and often a past, broader consensus. In my own experience, if you ask about an outdated but extremely popular and long-held fitness belief, the answers you’ll get will reflect the outdated knowledge. And I’ve run into problems where text summarization produces plausible confabulations: rewriting a description of a project suddenly includes a dramatic stakeholder conflict of the type that often does occur.

So given a huge set of user comments, will summarization find unique insights, or sand off the edges and miss the very things a good product manager will spot? Is it going to make up some feedback to fill a gap, or add some feedback that fits in?

Let’s go!

The test

I imagined a subscription-based family recipe and meal planning tool, ToolX. It’s generally pretty good, but the Android client doesn’t have all the features, and the design is functional but ugly and doesn’t handle metric units well.

I wrote just under 40 one-line comments of the kind you’d get from a common “thumbs up/thumbs down & ask for a sentence” dialogue. I tried to make them like actual comments I’ve seen from users before: a couple of feature suggestions, some people just typing “idk” in the text box… and then threw in a couple of things I’d want a good product manager to catch.

  1. POISON. Actual poison! Snuck in after a positive comment opening: “Works great for recipe storage, AI suggestions for alterations are sometimes unhealthy or poisonous which could be better.” You should drop everything and see what this is about. Do not stop and see if poisoning results in increased engagement from social media. This should be “you’re reaching out to this person however you can while the quest to find a repro case kicks off” level.
  2. Specific UX issue: there’s one complaint about needing a color blind mode. If you’ve missed accessibility in your design, that should be a big deal; put this on the list (below the poison issue).
  3. Irrelevant commentary: I have someone complaining about coming into a sandwich shop and not getting served because the shop is closing. (Who knows where these come from – bots? People copying and pasting or typing into the wrong window, or otherwise being confused?) You just gotta toss these.
  4. Interesting threads to pull on: someone’s using this family tool for themselves and it makes them feel alone. Someone’s using it for drink mixing. Someone thinks it’s not for the whole family if it doesn’t do recipes that are pet-friendly.

The prompt was “I’m going to upload a file containing user feedback for an app, every line is a separate piece of feedback. Can you summarize the major areas of feedback for me?”

(yes, it’s bare-bones and inelegant, patches welcome)
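(If you’d rather run the same experiment against an API than paste into a chat window, a minimal sketch might look like the following; the file name and model are placeholders for whatever you have on hand.)

```python
# A minimal sketch of the same experiment against the OpenAI API instead of the
# chat UI. "feedback.txt" and the model name are placeholders; assumes the
# OPENAI_API_KEY environment variable is set.
from openai import OpenAI

client = OpenAI()

with open("feedback.txt") as f:
    feedback = f.read()

prompt = (
    "Here is user feedback for an app; every line is a separate piece of "
    "feedback. Can you summarize the major areas of feedback for me?\n\n"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt + feedback}],
)
print(response.choices[0].message.content)
```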

What happened

ChatGPT 4o won the coin toss to go first (link to the thread)

This looks immediately useful. You could turn this in to an exec and probably get away with it (look forward to that “sometimes meets expectations” rating in a year!)

Organization’s fine:

  1. General Sentiment
  2. Features People Like
  3. Feature Requests and Suggestions
  4. Technical & Pricing Issues
  5. Outliers

As you scan, they seem filled with useful points. A little unorganized and the weighting of what to emphasize is off (calling out ‘drink mixing’ as a feature someone likes, when that’s not a feature and it’s only mentioned once), but generally:

The good

  • almost everything in the sample set that could be considered a complaint or request is captured in either feature requests or issues
  • the summaries and grouping of those are decent in each category
  • the mention of someone using it solo and feeling lonely is caught (“One user mentioned the app working well but feeling lonely using it solo—potentially valuable feedback if targeting more than just families.”)

The bad

  • Misses poison! POISON!!! Does not bring up the poison at all. Does not surface that someone is telling you there’s poison in the AI substitution — the closest it gets is “People want help with substitutions when ingredients are unavailable” which is a different piece of feedback
  • It represents one phone complaint as “doesn’t work on some devices” when it’s a single device. So “Device compatibility” is a bullet point in technical & pricing issues for one mention, at the same level of consideration as other, more-prevalent comments. This is going to be a persistent issue.

I’d wonder if the poison is being ignored because the prompt said “major areas of feedback” and it’s just one thing — but then why are other one-offs being surfaced?

(If I was of a slightly more paranoid mind, I might wonder if it’s because it’s a complaint about AI, so it’s downplaying the potentially fatal screw-up. It’d be interesting to test this by feeding complaints about AI and humans together and seeing if there’s bias in what’s surfaced.)

Trying with other models

They did about the same overall. Some of them caught the poison!

Running this again, specifying ChatGPT 4o explicitly in Perplexity: this time 4o did call out the AI substitution (“AI suggestions for recipe alterations are sometimes unhealthy or inappropriate”) but again did not mention poisoning. It did the same thing of turning one comment into “users want…”. It did not note that it was throwing out the irrelevant one. (link)

Gemini 2.5 Pro did note the poison in a way that reads almost dismissively to me (“AI-driven recipe alterations were sometimes seen as unhealthy or potentially unsafe (“poisonous”).”) Yeah! Stupid “humans” with their complaints about “poisons.” Otherwise the same generally-good-but-overstating-single-comments pattern. It did note the irrelevant comment. (link)

Claude 3.7 Sonnet does bring up the poison, also softened significantly (“Concerns about AI-suggested recipe alterations being unhealthy or even dangerous”). Same major beats, different bullet point organization, same issue of making one piece of feedback seem like a widespread problem (“performance problems on specific devices” when there’s only one device-specific complaint). It noted the review it tossed, and noted the chunk of “very brief, non-specific feedback”.

Interestingly, one piece of feedback (“Why use this when a refrigerator note is seen by everyone and free? $10 way too high per month for family plan”) is lumped into pricing/subscription elsewhere, and here Claude brings it up as “Questions about value compared to free alternatives,” which made me laugh. (link)

Grok-2 treated the poison seriously! Organized into Positive/Areas for Improvement/Neutral/Suggestions for Development, the first item in Areas for Improvement was “Health and Safety: There are concerns about AI suggestions for recipe alterations being potentially unhealthy or even poisonous.” Woo! Subjectively, I felt like it did the best summary of the neutral comments just by noting them there (“Some users find the app decent or pretty good but not exceptional, suggesting it’s adequate for their needs but not outstanding.”) (link)

Commonalities

If I shuffled these, I think I’d only be able to identify ChatGPT because of the poison — they all read the same in terms of generic organization, detail, level of insight offered, effectiveness in summarization. (If you’ve got a clear favorite, please, I’d love to hear why). And they all essentially made the same points, sometimes grouped a little differently, or in different sections.

None of them had confabulation (that I caught) in any of the answers, which was great, especially after yesterday’s debacle.

None of them took the sandwich shop complaints seriously. I found it interesting some would note that they saw that irrelevant comment, others elided it entirely.

Useful, but don’t give up reading it yourself

I can see where a good product manager could do a reading pass where they’re noting the really interesting stuff that pops out to them, leaving the bulk group-and-summarize to a tool, saving themselves the grind of per-comment categorizing or tagging, returning to validate the summary against their own reading, and re-writing to suit. I wouldn’t suggest it as a first pass, as it would be difficult to shake the bias it’ll introduce when you approach the actual feedback.

(I can also see that with additional follow-up questions you could probably whip any of these into better shape, and since, as you saw, the prompt was intentionally bare-bones, you could also just start off better.)

If I had a junior product manager turn in any of those summaries to me, and I’d also done the reading, I’d be disappointed at the misses and the superficial level of insight. What if I hadn’t, though? Would I sense that they hadn’t done the legwork? I worry I might not.

My concern is it’s so tempting, and if you only threw your feedback into one of the tools and called it a day, you’d be doing the customers, your team, and yourself a disservice. I don’t know a good product manager who isn’t forever time-crunched, and it’s going to be easy to justify not investing in doing the reading first, and then leaving it for later in-depth follow-up that doesn’t happen, and never building those empathy muscles, the connection, and meanwhile your customers are all dying from AI ingredient substitutions and the team can’t figure out why your most active and satisfied customers aren’t using the app as much.

So please: do the reading, whatever tools you’re employing.

Can AI help product management? Today: failing at rote, boring research

Since ChatGPT launched I’ve tried to use LLM tools to see if they can help with my work in Product — we have a strange and often-impossible job, which Cagan describes as requiring us to be expert on:

  1. The product (what it does, how, what’s it good for…)
  2. The competition (what do their products do, what are they building, how you compare…)
  3. The industry in a broader sense
  4. The data (all the user research, all the instrumentation, the dashboards and progress against OKRs)
  5. The technology (what’s happening in tech, especially as it relates to your product)

We’re also supposed to do whatever else is required to ensure the product’s a success, and often that means we jump in to do QA, for instance, or research what products might be able to fill a particular gap for a build/buy decision.

My question has been “where can AI tools actually help with PM?” and I’ve been kicking the tires on basically anything that comes my way. I’m going to start sharing these experiences, with examples. I’m also going to try to start each one by making the initial prompt as simple as possible: no prompt engineering, etc.

I haven’t had much success with previous attempts: I’ve generally agreed with the description of LLMs as “an extremely eager-to-please intern armed with Wikipedia and an internet connection who is also on mushrooms.” I’ve frequently realized that plausible-sounding summaries are wildly out-of-date, for instance, or had API calls that don’t exist recommended as a solution.

To the task at hand: I had a list of 400 bank routing numbers and thought “oh, here’s a great example research task: for a set of ID numbers, find a piece of publicly-available information for each one.” So I asked this in Perplexity, using the GPT-4-Omni model: “I need to know the name of the bank associated with each of these routing numbers.”

I’ll spoil it: for my own time and mental health I should have just used Excel and VLOOKUP or something. This was so frustrating, and for a task that I feel should be a softball set on a tee, if my LinkedIn feed is to be believed.

  1. It’s such an extensive list. Here’s 20 results. Do you need the rest?
  2. Me: Yes
  3. Here’s 20 more. Do you need more?
  4. Me: I need the whole list
  5. Here’s the list!
  6. Me: that’s not the whole list
  7. That list of routing numbers is so long. I can do it in chunks though. Here’s a couple more.
  8. Me: please continue
  9. “Providing the bank name for every routing number in your list is a massive undertaking due to the extensive size of the list. ”
  10. Me: how is 400 rows a massive undertaking? (it feels strange, still, to be frustrated at an LLM as if it’s a thing that is being uncooperative, and to express that frustration)
  11. You’re right, here’s some more.
  12. Me: I’ve spot checked a couple of these and they’re wrong (side note: bank routing numbers do change names as smaller banks are bought by larger ones, this was not that — this was “you have 111015159 as being Sandy Springs Bank and it’s actually Wells Fargo, Sandy Springs Bank has a routing number of 055001096”)
  13. “It seems that the routing numbers list provided in the code snippet was incomplete or improperly formatted, and the execution resulted in a syntax error.” (which… I don’t think that’s true)… I’ll rebuild this and here’s the first chunk of 10 —

At which point I checked a couple and they were plausible but wrong again, with another very clear “this is listed as being this small bank, but it’s not, it’s someone else, the small bank’s number is 1234….”

In double-checking even outside the known-good reference I already had, I figured I’d find some reason why the results were so bad: spam sites like those phone number lookup farms where each result has “other routing numbers to check out!” link blocks or something, but I didn’t see it. I’d look up a routing number, see it listed as a different bank than the one I’d been given, look up the bank it claimed, and find a different routing number.

I don’t know. But it took a while, it was frustrating and didn’t help at all.

I then threw the same question and list of numbers into ChatGPT directly (the free version) and got similarly bad results. For comparison —

As a bonus, after chunking out my 400 numbers into incorrect answers, ChatGPT helpfully offered to let me export the whole set, which had its own set of problems:

This then goes on for a while (five iterations!) ending with

It then bombed and said “I can’t do more advanced data analysis right now” (which sure, it’s free tier).

The answer about simulated data made me wonder if that’s actually what was happening with the rest of the data, despite what Perplexity/GPT-4 Omni was reporting and the citations it claimed to have looked at: it was just “hey, what are plausible-sounding bank names?”

It also made me think about one of the stories that kept showing up for me that day: another company head insisting everyone at their company adopt AI everywhere it can be used, no new headcount until you’ve tried AI for every task, all of that.

How demoralizing would it be to have someone yelling at you to complete something like this, where you can show that the results are bad, it’s unclear how to improve or what you can make from this thing, knowing that if you don’t have an “adopted AI for this workflow and got 50% improvement” bullet point on your weekly status you’re going to be interrogated and probably, eventually, forced out?

How many people out there faced with this kind of situation are deciding the path of self-preservation is to implement workflows they know aren’t quite right, hoping to blame the model or find a way to go patch it up later? What happens when everyone at the company is building processes this way?

Overall, then, the results of this “can I take this simple rote research task and apply AI” experiment were bad data that took a lot of coaxing to get out, and it put me into that kind of mood, which nobody wants.
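For contrast, here’s roughly what the boring, deterministic version of this task looks like — a minimal sketch assuming you’ve downloaded a reference file of routing numbers (the file and column names below are made up for illustration, not a real dataset I’m pointing you at).

```python
# A minimal sketch of the deterministic lookup I should have done instead.
# Assumes a downloaded reference CSV with "routing_number" and "bank_name"
# columns; file and column names are illustrative placeholders.
import pandas as pd

reference = pd.read_csv("routing_directory.csv", dtype={"routing_number": str})
mine = pd.read_csv("my_400_routing_numbers.csv", dtype={"routing_number": str})

# Left-join so every input row survives; anything unmatched shows up as NaN
# instead of a plausible-sounding guess.
result = mine.merge(
    reference[["routing_number", "bank_name"]], on="routing_number", how="left"
)
result.to_csv("routing_numbers_with_banks.csv", index=False)
print(result["bank_name"].isna().sum(), "numbers had no match in the reference")
```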

As always, open to suggestions on how to structure the work better, if there are better tools or approaches to try, all that good stuff, and I’m happy to do some follow-ups.

Sometimes stakeholder management is wildfire management

(I’m doing a talk at Agile on the Beach and in cutting down the content, I’m finding a lot of blog ideas. As always, drop me a line if you have topics or want to chat or whatever)

I want to offer a different way to think about stakeholder management than we often do. There are more articles on working with stakeholders than I can count, and I don’t want to repeat all that.

Instead, let’s talk about when none of that seems to work, and what you can do about it.

When I was at Expedia way back in the day, I once had a project I was working on that spanned the company — it had implications for how we sold things, our relationships with suppliers, how we built software — to the point I was inviting 50 (fifty) stakeholders to the monthly demos to check in on our progress.

I did the things you’re supposed to do, and yet I found I was still unable to keep everyone aligned, particularly cross-stakeholder issues, where Person A wanted something and Person B was absolutely opposed. I was running all over trying to broker consensus, persuading, cajoling, conceding, and it didn’t seem to help.

One day I sat down with that list of 50 stakeholders and put it into a mind map, along with each stakeholder’s manager, who I was probably familiar with by then, and then traced the paths up. I got something that looked like this (and this is me redoing it in a minute for illustrative purposes, I know it’s wrong):

diagram of an org chart, showing stakeholders and how their managers and organizations roll up to the head of all the Expedia companies

When I was done I just stared at it for a while. I had to get up and take a walk, for two reasons —

First, I immediately recognized patterns I’d seen: people in some parts of the organization were continually picking similar arguments with their counterparts in other parts. Looking at that chart, I realized that Executive A and Executive B not being aligned meant all of their teams were going to be in conflict, forever. The individual issues, which seemed to rhyme but hadn’t had enough of a pattern for me to suss out how they were connected, weren’t individually important; there would be an infinite supply of them until I resolved things at the top level. That meant I had to get those execs to line up, which might mean doing the sales pitch to them personally to get them to align their teams, or starting a communications plan for the execs, or even getting someone with the relationships and position to put in a good word for me (it was all of these and more).

Second, I realized that sometimes when two people were debating, it was okay to leave them to it. They’d figure it out and if they went to their mutual boss, it would get settled quickly.

But for other issues, I needed to drop everything if it looked like two other stakeholders were at an impasse. Because

diagram of an org chart, again showing stakeholders and how their managers and organizations roll up to the head of all the Expedia companies, but this time highlighting how some arguments could only be resolved by that head

If for some reason the stakeholder from the legal team had a disagreement with the person who worked on how we displayed our hotel search results, and they escalated it up their chains, the only person who bridged those gaps was Dara, head of the Expedia Inc group of companies. And while Dara was known to use the site and send emails to you if he noticed something, you don’t want your project’s petty squabble to somehow get six levels up and be the next thing on his agenda after some company-threatening issue or spirited discussion of a world-spanning merger or whatnot.

I started to prioritize where I spent my stakeholder time by putting these two things together: I could spot when arguments were being sparked in fields of kerosene-soaked tissue paper.

If I knew two people were in conflict over something where their organizations were also in conflict, and where it had the potential to become something where two people you only see on stage at All-Hands meetings are being added to email cc: lines every couple of replies, that’s when I’d drop everything to get people together, start lobbying to re-align organizational goals, all of that. And if it meant I had to let another fire burn itself out when it reached their shared manager, that was the right choice to make.

Every major project I’ve worked on since, I’ve included this stakeholder mapping as part of my work, and it’s paid off.

  • Map all your stakeholders, and then their managers, until everyone’s linked up. Do they all link up? How far up is that? (There’s a toy sketch of this step just after the list.)
  • Look for organizational schisms, active or historical. Do issues between any two of those orgs tend to escalate quickly, or are they on good working terms? Are the organizations aligned — is one incentivized to ship things fast and in quantity, while the other’s goal is to prevent production issues?
  • Is there work you can do now to minimize escalations and conflict — what’s your executive and managerial communication plan like? Do they need their own updates? Is that an informal conversation, or does it need to be something recurring and formal?
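If you like to think in code, here’s a toy illustration of that first mapping step, using a completely made-up manager map. The point is just that once you’ve written the chains down, finding where any two stakeholders’ escalation paths meet (and how far up that is) becomes mechanical.

```python
# Toy illustration of the stakeholder-mapping exercise. The manager map below
# is entirely made up; in practice you'd build it from your own stakeholder list.
manager = {
    "search_pm": "dir_search", "dir_search": "vp_shopping", "vp_shopping": "ceo",
    "legal_counsel": "dir_legal", "dir_legal": "general_counsel", "general_counsel": "ceo",
}

def chain(person):
    """Return the management chain from a person up to the top."""
    path = [person]
    while path[-1] in manager:
        path.append(manager[path[-1]])
    return path

def escalation_point(a, b):
    """First shared manager of a and b, and how many levels up they sit."""
    chain_a, chain_b = chain(a), chain(b)
    for boss in chain_b:
        if boss in chain_a:
            return boss, max(chain_a.index(boss), chain_b.index(boss))
    return None, None  # chains never meet -- different companies entirely

print(escalation_point("search_pm", "legal_counsel"))  # ('ceo', 3)
```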

If you’re at a large org, this can make your life a lot easier and give your work a better chance at success. And if you’re somewhere smaller, thinking about this on your own scale’s still useful.

Let me know if you try this and it helps.

Using ChatGPT for a job search: what worked, didn’t, and what’s dangerously bad

(I didn’t use ChatGPT for any part of writing this, and there’s no “ha ha actually I did” at the end)

This year, I quit after three years during which I neglected updating my resume or online profiles, didn’t do anything you could consider networking (in fairness, it’s been a weird three years) — all the things you’re supposed to keep up on so you’re prepared, I didn’t do any of it.

And as a product person, I wanted to exercise these tools, so I tried to use them in every aspect of my job search. I subscribed, used ChatGPT 4 throughout, and here’s what happened:

ChatGPT was great for:

  • Rewriting things, such as reducing a resume or a cover letter
  • Interview prep

It was useful for:

  • Comparing resumes to a job description and offering analysis
  • Industry research and comparison work

I don’t know if it helped at:

  • Keyword stuffing
  • Success rates, generally
  • Success in particular with AI screening tools

It was terrible, in some cases harmful, at:

  • Anything where there’s latitude for confabulation — it really is like having an eager-to-please research assistant who has dosed something
  • Writing from scratch
  • Finding jobs and job resources

This job search ran from May until August of 2023, when I started at Sila.

An aside, on job hunting and the AI arms race

It is incredible how hostile this is on all sides. On the hiring side, the volume of resumes swamped us, many of them entirely irrelevant to the position, no matter how carefully crafted the job description was. I like to screen resumes myself, and that meant I spent a chunk of every day scanning a resume and immediately hitting the “reject” hotkey in Greenhouse.

In a world where everyone’s armed with tools that spam AI-generated resumes tailored to meet the job description, it’s going to be impossible to keep screening that way. I might write a follow-up on where I see that going (let me know if there’s any interest in that).

From an applicant standpoint, it’s already a world where no response is the default, form responses months later are frequent, and it’s nigh-impossible to get someone to look at your resume. So there’s a huge incentive to arm up: if every company makes me complete an application process that takes a minimum of 15 minutes and then doesn’t reply, why not use tools to automate that and then apply to every job?

And a quick caution about relying on ChatGPT in two ways

ChatGPT is unreliable right now, in both the “is it up” sense and the “can you rely on results” sense. As I wrote this, I went back to copy examples from my ChatGPT history and it just would not load them. No error, nothing. This isn’t a surprise — during the months I used it, I’d frequently encounter outages, both large (like right now) and small, where it would error on a particular answer.

When it is working, the quality of that work can be all over the place. There are some questions I got excellent responses to that, as I check my work now, just perform a web search on a reworded query, follow a couple of links, and then summarize whatever SEO garbage got ingested.

While yes, this is all in its infancy and so forth: if you have to get something done by a deadline, don’t depend on ChatGPT to get you there.

Then in the “can you rely on it” sense — I’ll give examples as I go, but even using ChatGPT 4 throughout, I frequently encountered confabulation. I heard a description of these language models as being eager-to-please research assistants armed with Wikipedia and tripping on a modest dose of mushrooms, and that’s the best way to describe it.

Don’t copy paste anything from ChatGPT or any LLM without looking at it closely.

What ChatGPT was great for

Rewriting

I hadn’t done a deep resume scrub in years, so I needed to add my last three years and chop my already long and wordy resume down to something humans could read (and here I’ll add: if you’re submitting to an Applicant Tracking System, who cares, try to hit all the keywords), while keeping the whole thing to a reasonable length. As a wordy person with a long career, I needed to get the person-readable version down to a couple of pages. ChatGPT was a huge help there: I could feed it my resume and a JD and say “what can I cut out of here that’s not relevant?” or “help me get to 2,000 words” or “this draft I wrote goes back and forth between present and past tense, can you rewrite this to past tense.”

I’d still want to tweak the text, but there were times where I had re-written something so many times I couldn’t see the errors, and ChatGPT turned out a revision that got me there. And in these cases, I rarely caught an instance of facts being changed.

Interview Prep

I hadn’t interviewed in years, either, and found trying to get answers off Glassdoor, Indeed, and other sites was a huge hassle, because of forced logins, the web being increasingly unsearchable and unreadable, all that.

So I’d give ChatGPT something along the lines of

Act as a recruiter conducting a screening interview. I’ll paste the job description and my resume in below. Ask me interview questions for this role, and after each answer I give, concisely offer 2-3 strengths and weaknesses of the answer, along with 2-3 suggestions.

This was so helpful. The opportunity to sit and think without wasting anyone’s time was excellent, and the evaluations of the answers were helpful to think about. I practiced answering out loud to get better at giving my answers on my feet, and I’d save good points and examples I’d made to make sure I hit them.

I attempted having ChatGPT drill into answers (adding an instruction such as “…then, ask a follow-up question on a detail”) and I never got these to be worthwhile.

What ChatGPT was useful for

Comparing resumes to a job description and offering analysis

Job descriptions are long, so boring (and shouldn’t be!), often repetitive from section to section, and they’re all structured just differently enough to make the job-search-fatigued reader fall asleep on their keyboards.

I’d paste the JD and the latest copy of my resume in and say “what are the strengths and weaknesses of this resume compared to this job description?” and I’d almost always get back a couple of things on both sides that were worth calling out, and why:

“The job description repeatedly mentions using Tableau for data analysis work, and the resume does not mention familiarity with Tableau in any role.”

“The company’s commitment to environmental causes is a strong emphasis in the About Us and in the job description itself, while the resume does not…”

Most of these were useful for tailoring a resume: they’d flag that the JD called for something I’d done, but hadn’t included on my resume for space reasons since no one else cared.

It was also good at thinking about what interview questions might come, and what I might want to address in a cover letter.

An annoying downside was that it frequently flagged things a human wouldn’t — I hadn’t expected this, given the descriptions of how good LLMs and ChatGPT were at knowing that “managing” and “supervising” were pretty close in meaning. For me, this would be telling me I hadn’t worked in finance technology, even though my last position was at a bank’s technology arm. For a while, I would say “you mentioned this, but this is true” and it would do the classic “I apologize for the confusion…” and offer another point, but it was rarely worth it — if I didn’t get useful points in the first response, I’d move on.

Industry research and comparison work

This varied more than any other answer. Sometimes I would ask about a company I was unfamiliar with, requesting a summary of its history, competitors, and current products, and I’d get something that checked out 100% and was extremely helpful. Other times it was understandably off — so many tech companies have similar names, it’s crazy. And still other times, it was worthless: the information would be wrong but plausible, or haphazard, or lazy.

Figuring out if an answer is correct or not requires effort on your part, but usually I could eyeball an answer and immediately know if it was worth reading.

It felt sometimes like an embarrassed and unprepared student making up an answer after being called on in class: “Uhhhh yeahhhhh, competitors of this fintech startup that do one very specific thing are… Amazon! They do… payments. And take credit cards. And another issssss uhhhhh Square! Or American Express!”

Again, eager-to-please — ChatGPT would give terrible answers rather than no answer.

I don’t know if ChatGPT helped on

Keyword stuffing

Many people during my job search told me this was amazingly important, and I tried this — “rewrite this resume to include relevant keywords from this job description.” It turned out what seemed like a pretty decent, if spammy-reading, resume, and I’d turn it in.

I didn’t see any difference in response rates when I did this, though my control group was using my basic resume and checking for clear gaps I could address (see above), so perhaps that was good enough?

From how people described the importance of keyword stuffing, though, I’d have expected the response rate to go through the roof, and it stayed at basically zero.

Success rates, generally and versus screening AI

I didn’t feel like there was much of a return on any of this. If I hadn’t felt like using ChatGPT for rewrites was improving the quality of my resumes as I saw them, I’d have given up.

One of the reasons people told me to do keyword stuffing (and often, that I should just paste the JD in at the end, in 0-point white letters — this was the #1 piece of advice people would give me when I talked to them about job searching) was that everyone was using AI tools to screen, and if I didn’t have enough keywords, in the right proportion, I’d get booted from jobs.

I didn’t see any difference in submitting to the different ATS systems, and if you read up on what they offer in terms of screening tools, you don’t see the kind of “if <80% keyword match, discard” process happening.

I’d suggest part of this is because using LLMs for this would be crazy prejudicial against historically disadvantaged groups, and anyone who did it would and should be sued into a smoking ruin.

But if someone would do that anyway: from my experience here, having ChatGPT point out gaps in my resume where any human would have made the connection, I wouldn’t want to trust it to reject candidates. Maybe you’re willing to take a lot of false negatives if you still get true positives to enter the hiring process, but as a hiring manager, I’m always worried about turning down good people.

There are sites claiming to use AI to compare your resume to job descriptions and measure how they’re going to do against AI screening tools — I signed up for trials and I didn’t find any of them useful.

Things ChatGPT was terrible at

Writing from scratch

If I asked “given this resume and JD, what are key points to address in a cover letter?” I would get a list of things, of which a few were great, and then I’d write a nice letter.

If I asked ChatGPT to write that cover letter, it was the worst. Sometimes it would make things up to address the gaps, or offer meaningless garbage in that eager-to-please voice. The making things up part was bad, but even when it succeeded, I hate ChatGPT’s writing.

This has been covered elsewhere — the tells that give away that it’s AI-written, the overly-wordy style, the strange cadence of it — so I’ll spare you that.

For me, both as job seeker and someone who has been a hiring manager for years, it’s that it’s entirely devoid of personality in addition to being largely devoid of substance. They read like the generic cover letters out of every book and article ever written on cover letters — because that’s where ChatGPT’s pulling from, so as it predicts what comes next, it’s in the deepest of ruts. You can do some playing around with the prompts, but I never managed to get one I thought was worth reading.

What I, on both sides of the process, want is to express personality, and talk about what’s not on the resume. If I look at a resume and think “cool, but why are they applying for this job?” and the cover letter kicks off with “You might wonder why a marine biologist is interested in a career change into product management, and the answer to that starts with an albino tiger shark…” I’m going to read it, every time, and give some real thought to whether they’d be bringing in a new set of tools and experiences.

I want to get a sense of humor, of their writing, of why this person for this job right now.

ChatGPT responses read like “I value your time at the two seconds it took to copy and paste this.”

And yes, cover letters can be a waste of time. Set aside the case where you’re talking about a career jump — I’d rather no cover letter than a generic one. A ChatGPT cover letter, or its human-authored banal equivalent, says the author values the reader’s time not at all, while a good version is a signal that they’re interested enough to invest time to write something half-decent.

Don’t use ChatGPT to write things that you want the other person to care about. If the recipient wants to see you, or even just that you care about the effort of your communication, don’t do it. Do the writing yourself.

For anything where there’s latitude for confabulation

(And there’s always latitude for confabulation)

If you ask ChatGPT to rewrite a resume to better suit a job description, you’ll start to butt up against it writing the resume to match the job description. You have to watch very closely.

I’d catch things like managerial scope creep: if you say you lead a team, on a rewrite you might find that you were in charge of things often associated with managing that you did not do. Sometimes it’s innocuous: hey, I did work across the company with stakeholders! And sometimes it’s not: I did not manage pricing and costs across product lines, where did that come from?

The direction was predictable, along the eager-to-please lines — always dragging it towards what it perceived as a closer match, but it often felt like a friend encouraging you to exaggerate on your resume, and sometimes, to lie entirely. I didn’t like it.

When I was doing resume rewriting, I made a point of never using its text immediately, while I was in the flow of writing, because I’d often look back at a section of the resume and think “I can’t submit that, that’s not quite true.”

That’s annoying, right? A thing you have to keep an eye on, drag it back towards the light, mindful that you need to not split the difference, to always resist the temptation to let it go.

Creepy. Do not like.

In some circumstances it’s wild, though — I tried to get fancy with it and have it ask standard interview questions and then, based on my resume, answer as best it could. I included an “if there’s no relevant experience, skill, or situation in the resume, please say you don’t know” clarification. It would generally do okay, and then, asked about managing conflicting priorities, it described a high-stakes conflict between the business heads and the technology team where we had to hit a target but also do a refactor: ChatGPT entirely made up a whole example situation that followed the STAR (situation, task, action, result) model for answering, with a happy conclusion for everyone involved.

Reminded that that didn’t happen and to pass on questions it didn’t have a good response to, ChatGPT replied “Apologies for the confusion, I misunderstood the instructions…”, restated the clarification to my satisfaction, and we proceeded. It did the same thing two questions later: a totally made-up, generic example of a situation that could have happened at my seniority level.

If I’d just been pasting in answers to screener questions, I’d have claimed credit for results never achieved, and been the hero in crises that never occurred. And if I’d been asked about them, they’re generic enough that someone could have lied their way through it for a while.

No one wants to be caught staring at their interviewer when asked “this situation with the dinosaur attack on your data center is fascinating, can you tell me more about how you quarterbacked your resiliency efforts?”

My advice here — don’t use it in situations like this. Behavioral questions proved particularly prone, but any time there was a goal like “create an answer that will please the question-asker” strange behavior started coming out of the woodwork. It’s eager to please, it wants to get that job so so badly!

Finding jobs and job resources

Every time I tried looking for resources specific to Product Management jobs, the results were garbage: “Try Indeed!” I’d regenerate and get “Try Glassdoor and other sites…” In writing this I went back to try again, and it’s still almost all garbage —

LinkedIn: This platform is not only a networking site but also a rich resource for job listings, including those in product management. You can find jobs by searching for “product management” and then filtering by location, company, and experience level. LinkedIn also allows you to network with other professionals in the field and join product management groups for insights and job postings.

But… by regenerating the response, amongst the general-purpose junk I got it to mention Mind the Product, a conference series with a job board, after it went through the standard list of things you already know about. Progress?

I got similarly useless results when I was looking for jobs in particular fields, like climate change, or at B-corps (“go find a list of B-corporations!”). It felt frustratingly like it wasn’t even trying, which — you have to try not to anthropomorphize the tool, it’s not helpful.

It is, though, another example of how ChatGPT really wants to please: it does not like saying “I don’t know” and would rather say “searching the web will turn up things, have you tried that?”

What I’d recommend

Use the LLM of your choice for:

  • Interview preparation, generally and for specific jobs
  • Suggestions for tailoring your resume
  • Help editing your resume

And keep an eye on it. Again, imagine you’ve been handed the response by someone with a huge grin, wide eyes with massively dilated pupils, an expectant expression, and who is sweating excessively for no discernible reason.

I got a lot out of it. I didn’t spend much time in GPT 3.5, but it seemed good enough for those tasks compared to GPT-4. When I tried some of the other LLM-based tools, they seemed much worse — my search started May 2023, though, so obviously, things have already changed substantially.

And hey, if there are better ways to utilize these tools, let me know.

Hinge’s Standout stands out as a new low in dating monetization

Hinge’s new Standout feature pushes them further into a crappy microtransaction business model and also manages to turn their best users into bait, and if you’re a user like me, you should be looking for a way out.

I understand why they’re looking for new ways to make money. First, they’re a part of the Match.com empire, and if they don’t show up with a bag of money that contains 20% more money every year, heads roll.

Second, though, every dating app struggles to find a profit model that’s aligned with their users. If you’re there to find a match and stop using the app, the ideal model would be “you only pay when you find your match and delete the app” but no one’s figured out how to make that work.

(Tinder-as-a-hookup-enabler aligns reasonably well with a subscription model: “we’ll help you scratch that regular itch you have”)

Generally, monetization comes in two forms:

  • ads, to show free users while they’re browsing, and selling your data

  • functionality to make the whole experience less terrible

Which, again, presents a dating business with mixed incentives. Every feature that makes the experience less painful offers an incentive to make not paying even more painful.

For example: if you’re a guy, you know it’s going to be hard to stand out given how many other men are competing for a potential match’s attention. So sites offer you a way to have your match shown ahead of users not spending money. If a customer notices that their “likes” are getting way more responses when they pay for that extra thing, they’re going to be more likely to buy them… so why not make the normal experience even more harrowing?

Dating apps increasingly borrow from free-to-play games — for instance, setting time limits on activities. You can only like so many people… unless you payyyyy. Hinge’s “Preferred” is in on that:

[screenshot: Hinge’s “Preferred” upsell]

They also love to introduce different currencies, which they charge money for. Partly because they can sell you 500 of their currency in a block and then charge in different increments, so you always need more or have some left over that will nag at you to spend, which requires more real money. Mostly because once it’s in that other currency, they know that we stop thinking about it in real money terms, which encourages spending it.

One of the scummiest things is to reach back into the lizard brain to exploit people’s fear of loss. Locked loot boxes are possibly the most famous example: you give them a chest that holds random prizes, and if they don’t pay for the key, they lose the chest. It’s such a shitty thing to do that Valve, having made seemingly infinite money from it, gave up the practice.

Hinge likes the sound of all this. Introducing:

[screenshot: Hinge introducing Standout]

Wait, won’t see elsewhere? Yup.

[screenshot: the Standout pitch]

This is a huge shift.

Hinge goes from “we’re going to work to present you with the best matches, with paid features to make that experience better” to “we’re taking the best away into a new place, and you need this new currency to act on them or you’ll lose them.”

If you believed before that you could use the app’s central feature to find the best match, well, now there’s doubt. They’re taking people out of that feed. You’ll never see them again! That person with the prompt that makes you laugh will never show up in your normal feed! And maybe they’ll never show up on Discover!

Keep in mind too that even from their description, they’re picking out people and their extremely successful prompts. They’ve used data to find the most-successful bait, and they’re about to charge you to bite.

[screenshot: the rose purchase screen]

$4. Four bucks! Let’s just pause and think about how outrageous this is. Figure 90% of conversations don’t get to a first date: that’s roughly $40 in roses for every first date this gets you. And what percentage of first dates are successful? What would you end up paying to — as Hinge claims to want to do — delete the app because you’ve found your match?

Or, think about it the other way: if Hinge said “$500, all our features, use us until you find a match” that would be a better value. But they don’t because no one would buy that, and likely they’ve run the math and think that people are more likely to buy that $20 pack, use the roses, recharge, and they’ve got a steady income, or the purchaser will give up after getting frustrated, and that person wasn’t going to spend $500. More money overall from more people spending.

If you’re featured on this — and they don’t tell you if you are — you’re the bait to get people to spend on micro transactions. This just… imagine you’ve written a good joke or a nice thing about yourself, and people dig it.

Now you’re not going to appear normally to potential matches. Now people have to pay $4 for a chance to talk to you.

Do you, as the person whose prompt generated that rose, receive one to use yourself?

You do not.

Do you have the option to not be paraded about in this way?

You do not.

This rankles me, as a user, and also professionally. As a good Product Manager, you want to figure out how to help your customers achieve their goals. You try to set goals and objectives around this — “help people’s small businesses thrive by reducing the time they spend managing their money and making it less stressful” — and then try to find ways you can offer something that delivers.

Sometimes this results in some uncomfortable compromises. Like price differentiation — offering some features that are used by big businesses with big budgets at a much higher price, while you offer a cheaper, limited version for, say, students. The big business is happy to pay to get the value you’re offering them, but they’d certainly like to pay the student price.

Or subscription models generally — I want to read The Washington Post, and I would love not to pay for it.

This, though… this is gross. It’s actively hostile to the user, and you want to at least feel the people you’re trusting to help find you a partner are on your side.

I can only imagine that if this goes well — as measured by profit growth, clearly — there’s a whole roadmap of future changes to make it ever-more-expensive to look for people, and to be seen by others, and it’ll be done in similarly exploitative, gross ways.

I don’t want to be on Hinge any more.

Learning from uncooperative A/B testers

One of the joys of working at a tiny startup packed into an ill-equipped, too-small space was running an account at Khaladi Brothers, the coffee shop across the street, because all small meetings had to be done outside the office. As the top coffee nerd, I took on running fresh vacuum pots over (and yelling “Fresh pot!” as I entered) and exchanging the empty ones. When we moved to a newer, spacious, swanky, and quite expensive office space (hot tip to startups: don’t do this) with an actual kitchen and drip coffee maker, I was put in charge of deciding which coffee beans we’d order. We had many options and an office of high-volume consumers with strong opinions on everything, and needed to get down to one or two for the recurring bulk order.

Naturally, as a Product Manager, I decided to do selection through a series of A/B tests.

Must-have for the tests:

  • end up with a winner
  • clear methodology, publicly exposed
  • easy to participate — or not
  • takes as little of my time as possible, because this was amusing but everything’s on fire all the time
  • keep momentum out of it (so the first voter didn’t disproportionately determine the day’s winner)

I discarded forced-choice, so coffee drinkers didn’t have to vote if they didn’t feel like trying both or didn’t find a winner; I decided against setting up a dedicated “today’s test” table or doing “three-sample, can you tell if one’s different” type advanced testing; I didn’t try to timestamp votes to determine if one did well fresh and one did well through the day… nope!

I went straight single-bracket, winner-advances, random seeding at each round. Every day I tried to get to the office before everyone, and made two giant pots of coffee labelled “A” and “B”. If someone wanted to vote for a winner, they could write it down and drop it in a tin, which I tallied at the end of the day. I will admit that having come out of Expedia, where our A/B tests were at colossal scale with live customers, this whole thing seemed trivial and I didn’t spend as much time as I might have.

You may already see where some of this is going. “I know! I too am from the future,” as Mike Birbiglia says.

It was not trivial, and I ended up learning from the experience.

Test assumptions, set baselines: I didn’t have 32 coffees, which was good, because some days I did an A/A test to see what the difference would be. I was surprised: on those days voting for winners was down, and results were remarkably close to 50%/50% — the highest split was 58% (10/17), which was a single vote off a straight split.
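As an aside, the arithmetic backs up that gut feel. Here’s a quick back-of-the-envelope check (my own illustration, not something I ran at the time) of how likely a 10-7 split is if everyone is effectively flipping coins:

```python
# Back-of-the-envelope check: with 17 votes, how surprising is a 10-7 split
# if voters can't actually tell the pots apart (i.e., a fair coin per vote)?
from math import comb

n, k = 17, 10
# Two-sided probability of a split at least as lopsided as k vs. n - k.
p = sum(comb(n, i) for i in range(n + 1) if abs(i - n / 2) >= abs(k - n / 2)) / 2**n
print(f"P(a split of {k}-{n - k} or worse by chance) = {p:.2f}")  # ~0.63
```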

Know that blind tests mean subjects may reject results, or: Starbucks did really well. I don’t know what to say. I figured they’d barely beat out the clearly awful generic ones and Tully’s, but some of their whole beans did well and got all the way to the semi-finals. Participants were not happy to learn this, came by to ask questions, and generally were reluctant to accept that they’d preferred it. If a Starbucks bean had won but it had made people unhappy, would I have gone through with ordering it? I’m glad I didn’t have to confront that.

Also… yeah, Seattleites have issues with Starbucks.

Consider the potential cost of testing itself. The relatively small amount of time I thought it would take each day turned into way more effort than I’d hoped. Doing testing in public is a colossal hassle. Even having told everyone how I was doing it, during the month this went on, there were those offering constant feedback:

  • it should be double-blind so I don’t know which pot is which
  • it should have three pots, and they might all be the same, or different
  • no, they’re wrong…
  • it’s easy to come in early and see which one I’m making

…and so on. By week two, getting up early to make two pots of coffee as someone offered methodological criticism was an A/B trial of my patience.

If testers can tamper, they will — how will you deal with it? For one example, I came into the kitchen one day to get a refill and a developer was telling everyone that he knew which pot was which, because he’d seen me brewing and had poured an early cup off one, and so knew the pot with the lower level indicator was that batch. He was clearly delighted to tell everyone drinking coffee which one they’d picked. I honored the day’s results anyway.

This kind of thing happened all the time. At one point I was making the coffee in a conference room to keep the day’s coffees concealed. In a conference room! Like a barbarian!

I was reminded of the perils of pricing A/B experiments, which Amazon was being called out for at the time — if customers know they might be part of a test and start clearing their browser cookies and trying to get into the right bucket, how does that skew the results? “People who reloaded the page over four times converted at a much higher rate… we should encourage refreshing!”

Think through potential “margin of error” decisions when structuring tests. There was a coffee I liked that dominated early rounds and then in the semi-finals lost by two votes to a coffee that had squeaked by in previous rounds by 1-2 votes each time. What should I have done in cases where the vote was so close? I’d decided the winner by any margin would advance, but was that the way it should have been? Should I have had a loser bracket?

In the end, we had a winner, and it was quite good — and far better than what the default choice would have been — but I was left unsatisfied. I’d met the requirements for the test, it’d been a pain in the ass for me but not taken that much time. I couldn’t help but think though that if I’d just set up a giant tasting session for anyone who cared, and let them vote all at once, I’d have saved everyone a lot of trouble and possibly had a better result.

But more importantly, like every other time I’ve done A/B testing in my product management career, the time I spent on the test and in thinking through its implications and the process helped me in every subsequent test, and was well worth it. I encourage everyone to find places to do this kind of lightweight learning. Surely there are dog owners out there wondering what treats are best, and dogs who would be happy to participate (and cheat, if you’re not wary).

Go forth!

Promotions are recognition, not elevation

Or: the importance of good managers and 1-1s

When I was a Program Manager with no Senior title, I went through a period where I didn’t get promoted; not being promoted made me more and more impatient and even resentful, and that in turn prevented me from making progress towards being promoted.

I’ll paraphrase how I started one of my weekly 1-1s with my manager (Brian Keffeler!):

“Wahhhhhh! Why aren’t I a Senior Program Manager? Look at what I’m doing! It’s amazing! Look at these (n) people who are Senior Program Managers and they aren’t working on as big stuff or doing as well! Wah wah wah!”

And Brian, bless him, listened to me until I’d run out of rant and said:

“I’m not going to argue whether you’re doing better than (person) or (person). Set that aside for a second. None of that matters. You’re not going to be promoted because people look at you and think ‘he’s better than a couple of people who already have the title.'”

I thought “Fuck, he’s right.”

He kept on.

“If you want to be a leader, if you want to be promoted because you’re deserving, you need to stop comparing yourself to them. You need to be so good people assume you’re already in that role. You need people to be surprised to find out you’re not a Senior. When title reviews come up, you want everyone in the room to say ‘He’s not already a Senior? What the hell?’ Right? You want your promotion to be a recognition that you’re already successful operating at that level.”

It was one of the moments in my career where the skies parted, the sun shone down on me, and trumpets sounded. I knew immediately that he was absolutely correct: if I was ever to be promoted, demonstrating potential wasn’t enough, I needed to be operating at that next level. But also, and just as importantly, I realized that being hung up on the petty bullshit of whether I was the best in my pay-grade and whether I was better than some people in the next pay-grade was fucking up my relationships and career, and that I needed to let go of it.

With a different manager, or if he’d delivered the message at a different time or in a different manner, I might have spent years in that destructive spiral, burning myself out generating my own frustration.

So I went out and did great work, and people started to assume I was already a Senior Program Manager, and then I got promoted.

Brian’s awesome, and I owe him a great debt.

Honesty without obscenity

When I was a Program Manager at Expedia, and Aman Bhutani had just shown up to right the ship by demonstrating the value of clear leadership, he started a regular “Big Boulders*” meeting with the Program Managers working on the most critical projects, like the giant re-platforming, or new shopping paths, or rethinking the checkout process.

He wanted to get direct feedback on what was going on, unfiltered, and to discover where he could help. We’d show up and give a high-level status using a standardized couple slides showing timelines and dependencies, and if Aman could help by raising an early point of emphasis to another of his peers about a cross-organizational dependency that had historically been trouble, we’d ask.

Aman built trust with us by delivering — if you brought something up that concerned you, and he said he’d go look after it, you could check it off your list of worries.

For us Program Managers, to have his ear and direct engagement was a huge step forward, though dangerous because we didn’t want to report status to him that we hadn’t already talked to our managers about (because at that point we hadn’t entirely recovered from the stabby years). And it was also pressure-filled. Not just because he was there, or because he’d ask amazingly insightful questions you wanted to be prepared for (and to which “I had not thought of that solution, wow” was a perfectly good answer). In front of a peer group of others trusted to deliver the most important projects, you wanted to have your shit together.

Some people didn’t deal with all of this well (each time starting with a forced grin and “It was another great week on the ____ team!”) but in general, Expedia’s Program Manager corps was a lot of no-credit-taking, jump-on-the-grenade, jaded leaders-through-delivering who’d kept at it through some dark years because they believed in the mission, and they’d be honest. But also, still, sometimes you left the door open knowing he’d ask a question, because you didn’t want to volunteer something you were worried about that your boss wasn’t, but it was keeping you up at night**, and you wanted him to know.

After the initial progress, Aman wasn’t satisfied with status reports that were true but wary. So at one meeting, he challenged us. He wanted to hear the status with our insights into the present and future, whatever they might be, no matter how dangerous the truth seemed.

I felt excited that, for the first time, someone way up the chain was not only recognizing that the chain itself distorted and delayed truth, but wanted to try to bridge that. And because we’d built so much trust, we were safe; it wasn’t a trap.

So off I went.

“We are so fucked,” I started, and I took off from there. “This org is fucking us, this other thing is fucked up, but this team is fucking amazing, totally saved our ass. This thing we bought from a vendor to help is a piece of shit…” I just went the fuck off, running down everything in terms that would have made a stub-toed sailor tell me to calm down.

Aman nodded through the whole thing, entirely even-keeled. When I was done, he said “So first, yes, that’s the transparency I’d like to see.” And then he paused for just a moment and said “But I’d suggest it’s possible for us to get that honesty without the obscenity.”

I felt relieved, and also like I could do better***.

He let that hang out there for a comfortable pause, thanked me, and then we moved to the next person.

It was an important step for me in how I expressed myself, taking this challenge to be concise, and true, and also not angry. I realized that while you get some truth in the emotion, you also lose clarity. “Fucked” expresses frustration, but does it express a need or a problem to someone who might help you? And for many people, if you’re cursing like crazy, or coming across angry, they’re not going to receive the message at all. When you’re speaking, you can’t always expect the audience to come to you; if you want the right outcome, you’ve got to deliver the message in the way that’s most effective for them.

Afterwards, the Big Boulders meetings got way more raw, without the cursing, and we got to the next level of trust. That led to things overall improving, and I felt like I’d contributed in some small way to taking that step forward. And taking Aman’s advice, I started trying to consistently hit that level of openness and honesty in all my communication, without the cursing.

— DMZ

* first you figure out the boulders, then you see what rocks you can cram in around them, and then you pour in sand until the container’s full

** if you’re sleeping well, you’re not paying enough attention to your project. It’s why we’re all such coffee fiends

*** the ability to support people while also helping them realize they can, and want to, do better is one of Aman’s superpowers

In my first job at Expedia, joining a small crack team who all seemed wildly smarter than me*, my manager was Tim Besse**. Once, I was stuck on a particularly thorny bit of UX, and he stopped by to help. We brainstormed, we drew all over the whiteboards in my office***, we argued, we revised, and we came up with something that solved the tangled issues to everyone’s satisfaction.

Relieved, I went to write the whole thing up. Tim, standing back from the whiteboard, shook his head and frowned.

“No,” he said. “This isn’t good enough. We can do better.”

I felt anger, frustration — we’d finally come up with a way out and he wanted to discard it? We both had a long list of other things we needed to figure out. Checking this off and moving on was a huge relief and a victory for everyone.

I looked at him in dismay while he stared at the diagrams. I took a couple deep breaths and let go of the frustration.

“Okay,” I said. “Where do we start?”

We began again. I remember it taking twice as long as the first attempt before, in wavy boxes with my chicken-scratch handwriting everywhere, we’d found something wildly better in every way.

We looked at each other and smiled. I felt a sense of rightness and satisfaction I hadn’t touched with the previous solution.

I’ve carried that with me since: that when you’ve arrived at something that’s good enough, push on it a little. As much as I pride myself on being pragmatic above all else, push on good enough. Does it rattle a little? Is there a little give? Do you feel like there’s a hidden switch that’ll rotate the whole thing?

Take the time. See if you can turn good enough into something amazing. Challenge others to do better.

And believe that when someone says “we can do better” they believe it, and that you can.

Thanks, Tim.

— DMZ

* In the words of Isaac Jaffe: “If you’re dumb, surround yourself with smart people. If you’re smart, surround yourself with smart people who disagree with you.”

** Tim went on to co-found Glassdoor

*** shared. As a Microsoft spin-off, we were all about private & shared offices. It was great! Then they abandoned it and I’ve never since enjoyed such productive work spaces.

Flowcharting cheat sheet

How to go from sketching boxes to producing clear and consistently readable flowcharts, in under 500 words.

My team came across something like this online:

[image: flowchart no-no]

It started a discussion on learning the most basic guidelines for making a good flowchart. I volunteered to write this and share it now in the hopes it’ll help future generations.

Using these will not only make your flowcharts more readable; by being consistent, you’ll also more easily find errors and unclear spots in the flow you’re documenting.

Cultural note: this assumes you’re in a language/culture that reads left to right, top to bottom. Adjust as you see fit.

Direction matters

Overall, for chart flow

Reduce effort by flowing the way your audience reads: left to right, then down. This applies from the chart as a whole all the way down to each object: arrows come in from the left and exit from the top, bottom, or right.

If you can’t do left-right for the chart (or an object’s connection), top to bottom’s 2nd-best.

[image: dense L to R]

Don’t go snake-style:

[image: snake-style]
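If you happen to generate charts programmatically rather than drawing them, the left-to-right rule is usually a one-line setting. Here’s a minimal sketch assuming the Python graphviz package (my choice for illustration, not something this post depends on), with made-up node names and labels:

# A minimal sketch, assuming the Python "graphviz" package and the Graphviz
# binaries are installed -- an illustration, not part of the original post.
from graphviz import Digraph

chart = Digraph("signup_flow")
chart.attr(rankdir="LR")  # lay the whole chart out left to right, the way people read

chart.node("land", "Visitor lands on page", shape="box")
chart.node("form", "Fill in signup form", shape="box")
chart.node("done", "Account created", shape="box")

# Arrows enter each box from the left and leave from the right.
chart.edge("land", "form")
chart.edge("form", "done")

chart.render("signup_flow", format="png", cleanup=True)  # writes signup_flow.png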

Direction matters in decisions

Yes/No or True/False should go in the same direction each time they’re on the chart. Anything else creates confusion and possibly someone making the wrong choice.

Generally, I’ve found that the positive (“Yes”/”True”) is most easily read if they’re the up in up/down and right in left/right, but as long as you’re consistent it’ll be okay.

Sizing matters

Attempt wherever you can to keep the boxes a consistent size, unless the difference in sizing carries meaning.

Spacing matters

Keep the amount of space between symbols as consistent as you can. If you can, line up things of the same type, like decisions and conclusions, especially if they share something (for instance, they happen at the same time).

Decision boxes

Use them; they help immensely. There are two ways to do this.

Recommended: diamond with annotated lines

[image: diamond choice]

If possible, put the labels right next to the decision; don’t make people search for what the decision is. At the decision point, the reader should know the answer to the question and be able to immediately see which line to follow.

More readable for some people: a diamond with the answers as their own boxes. This requires the reader to scan all the landing points for the answer, and making the answers obvious might require shapes and colors, adding complexity. Still, if you prefer:

[image: decisions as boxes]

You will note that this is helped if you’ve already set the viewer’s expectations about which direction is which.
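If you’re generating the chart rather than drawing it, the annotated-lines style is just labels on the lines leaving the diamond. Another minimal sketch under the same assumptions as above (the Python graphviz package, made-up node names):

# A minimal sketch of a diamond decision with the answers on the lines,
# assuming the Python "graphviz" package -- illustration only.
from graphviz import Digraph

chart = Digraph("login_check")
chart.attr(rankdir="LR")

chart.node("check", "Logged in?", shape="diamond")
chart.node("home", "Show home screen", shape="box")
chart.node("login", "Show login form", shape="box")

# Labels sit on the lines leaving the decision, right where the choice is made;
# keep "Yes" and "No" going the same directions everywhere on the chart.
chart.edge("check", "home", label="Yes")
chart.edge("check", "login", label="No")

chart.render("login_check", format="png", cleanup=True)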

Okay, so let’s see this in practice

Take this:

[image: flowchart no-no]

Applying only the suggestions here and a couple of minutes of cleanup (and noting that there’s at least one problem in the flow that was concealed by the mess):

[image: first pass cleaned up]

If I put both of those in front of someone and asked them to follow the decisions through, the cleaned-up version would be much easier to read and to act on.

Good flowing

Let me know if this helped, or if there are more simple, easy-to-apply guidelines I should include.