Quick Thoughts from the Green Goblin of Generative AI

Okay, I’m going to let you in on a secret. One of the reasons I’m so frosty about the gap between generative AI’s narrative and what it can actually do when you build an application with it and put someone in front of it is that, for the last couple of months, I’ve been building an application with it.

*(OHHHHH YEAH, I’m gonna be that guy for a minute.)*

Does that make me a giant hypocrite? Maybe! But I don’t think so, because again, the main reason I have a bucket of scalding hot takes on the subject is not that I have a negative perception of Elon Musk or whatever. It’s that I’ve spent a good number of nights and weekends being personally let down by the capabilities of this stuff, not from the perspective of someone making silly pictures with DALL-E, but from that of someone trying to power a solution with the alleged precursor to the singularity.

… not that I haven’t made silly pictures. I have. Have you seen my album?

To be clear, I don’t think making a (kind of crappy, at the moment) app gives me some kind of veto power over any generative AI conversations. But… if I’m being completely honest, I don’t think most people who talk about generative AI have even done this much. Because if they had, I think we’d be asking a lot more interesting, reality-based questions about where this space is going, and/or what we should be empowering developers and end-users to do with it.

So here are my observations.

1) Development is very rewarding, very early

My idea was much less aspirational than the world’s best what-if scenario builder. I was doing some mindless corporate freelance work, and realized that I was literally transforming text, manually. Not writing, not creating — transforming. Except, I had some requirements, and it didn’t make any sense to keep typing them into a prompt. I needed to manage a hierarchy of information (what I am talking about), and a parallel structure of “physical” requirements (the length/form/structure of various instances of that information), and I needed to be able to quickly mix and match the two. For instance, give me two paragraphs and a bulleted list about this product’s security features.
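To make the mix-and-match idea concrete, here is a minimal Python sketch of the two parallel structures. It is an illustration, not the code from the actual app: the names `Topic`, `Shape`, and `build_prompt`, the prompt wording, and the sample product notes are all made up for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Topic:
    """One node in the 'what I am talking about' hierarchy (hypothetical)."""
    path: Tuple[str, ...]  # e.g. ("Example Product", "security features")
    notes: str             # the source information for this node


@dataclass(frozen=True)
class Shape:
    """One 'physical' requirement: the form the output should take (hypothetical)."""
    description: str       # e.g. "two paragraphs" or "a bulleted list"


def build_prompt(topic: Topic, shapes: List[Shape]) -> str:
    """Combine any topic with any list of shapes into a single prompt string."""
    subject = " > ".join(topic.path)
    form = " and ".join(shape.description for shape in shapes)
    return (
        f"Write {form} about: {subject}.\n"
        f"Use only these source notes:\n{topic.notes}"
    )


# Placeholder data, invented for this example.
security = Topic(
    path=("Example Product", "security features"),
    notes="Encrypts data at rest. Supports single sign-on.",
)
prompt = build_prompt(security, [Shape("two paragraphs"), Shape("a bulleted list")])
print(prompt)
```

The point of keeping topics and shapes separate is that either side can be swapped without retyping anything: the same notes can come back as one sentence or as five bullets just by passing a different list of shapes.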

It took me about 5 minutes of thinking and about an hour playing with Bubble.io to realize there was very, very little keeping me from actually making something like this if I could just handle some of the automatic, programmatic prompt generation. I texted my Dad, and within a couple days he was writing Python scripts I could summon (they live in England — the scripts, not my Dad) from Bubble, and we were generating content — not good content, but content, dammit — programmatically.

This was alarmingly easy, and to pat myself on the back, this wasn’t even one of those pure “wrapper” apps where a bunch of canned prompts are connected to buttons. We were mixing and matching a lot of stuff here, and I built management tools for all of it. Good job, Nate.

*(My love of basic information software UI is… weird, I’ll concede.)*

Now look, I’ve been making fun of various forms of AI content forever. But even I got a runner’s/developer’s high off this thing working. It’s exciting. You feel like you’re incredibly close to solving the problem you set out to solve, because you’re 80% of the way there in like, two weekends. I think this is a non-trivial factor in why people are so obsessed with this stuff — there is always demand for the kind of App Store 2009 gold-rush feeling among entrepreneurial folks, and this is probably the first tech trend to make it feel sooooo tantalizingly close.

2) There are way too many black boxes

My greatest concern with LLMs and generative AI in general is that the black box nature of it is being treated as a feature (“just type!”) when it’s actually a fundamental limitation no one has any intention of solving because it’s impossible. Tapping into OpenAI’s API has not made me feel any different. Now, you could push back and say “well, you didn’t build your own model!”, but I think that’s just kicking the can a little further down the stack. What does training the model mean? Whether it’s prompt –> output, or training data –> model, the real question is whether we’re just throwing this stuff in a blender and then giving an awkward thumbs up when plausible results come out. Maybe someday I’ll build my own model and that’ll change my mind, but I’m skeptical.

3) The black boxes might be on purpose, a.k.a., “There’s no Step 3!”

Great ad, first of all. Jeff Goldblum should do more ads. Anyways, like I said before, I have this nagging feeling that this technology, with its current fundamentals, is going to struggle to dazzle people without the ambiguities of a chat interface. Yes, the chat box gives the model its input. But there are lots of ways to tell a computer something. We used to tell it to delete things by typing, and most of us don’t do that anymore because it’s insane. What natural language really does is obfuscate. What are you getting? Why are you getting it? Pretending a computer is not a computer can feel magical, but one advantage it gives the computer is that, for some reason, it causes people to stop holding it to the standards we would ordinarily hold a computer to. And these are not arbitrary standards: computers that don’t meet them are useless, because they are simply incredibly fast chaos machines that can’t be relied on to do anything. They have no judgment. So go ahead, use controlled randomness to generate interesting results. But never make that randomness unauditable or inaccessible. Except… they did! And that’s where today’s magic comes from, but I think it’s unsustainable as a fundamental of a generally impactful development platform.

4) Context doesn’t scale… at all

I’m an objectively poor software architect. I get what should happen, but I’m pretty bad at making it happen, and I’m not disciplined enough to ensure it happens (just like an LLM!). But I’ve been in technology for twenty years, and I know how a lot of stuff basically works, or more accurately, how it doesn’t work. But even I was surprised at how limited the options are for helping your “AI-powered” application maintain context. One option is just to keep feeding it the entire history of the “conversation” each time you ask it something, which is completely insane, and quickly becomes wildly expensive. I’ve been messing around with OpenAI’s ability to parse documents as a way around this, because my application doesn’t need the context of the back and forth between me and the bot — I am just trying to use structured information to generate better output. But it’s hit or miss, and the pricing on all of this is totally up in the air, so who knows if my approach is even economically viable. I guess I will, soon.

This one bothers me the most because content is much less interesting to me than context. Context is everything! It’s why Siri is still so dumb. So when I, like a million other tech dorks, first stayed up all night chatting with ChatGPT for a couple days, the thing that really stuck with me was the context. I told it entire worlds of nonsense, making up fictional biographies of my friends and neighbors, and it maintained that context through a bunch of other requests and conversations, for weeks! That’s what really got me excited about this — I thought we had figured out how to maintain context in some incredible, invisible way. But… I don’t think we have. I think we’re just shoving entire transcripts into this thing over and over and over again, until we can’t, and we stop, and context is lost. Bummer.
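To put rough numbers on the resend-the-whole-transcript approach, here is a small Python sketch. It is a simplified model, not a measurement of any real API: it assumes each turn adds a fixed number of tokens to the history and that the full history is sent with every request. The function name and the token counts are invented for illustration.

```python
from typing import List


def tokens_sent_resending_history(turn_sizes: List[int]) -> int:
    """Total tokens sent if every request includes the entire history so far.

    turn_sizes[i] is the number of tokens that turn i adds to the history
    (a simplifying assumption; real usage varies turn by turn).
    """
    total_sent = 0
    history = 0
    for size in turn_sizes:
        history += size        # the transcript grows by this turn
        total_sent += history  # and the whole transcript is sent again
    return total_sent


turns = [100] * 10  # ten turns of 100 tokens each (made-up numbers)
print(sum(turns))                            # each turn sent only once
print(tokens_sent_resending_history(turns))  # whole history resent every turn
```

Under these assumptions, ten turns of 100 tokens add up to 1,000 tokens of actual conversation but 5,500 tokens sent, and the gap grows roughly with the square of the number of turns. That is the "quickly becomes wildly expensive" part.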

I mean, hey. I’m having fun. If nothing else — and this is to Point #1 — there is something magical about building applications, but making an application DO something is beyond the skills of a lot of wannabe developers like me. LLMs take a bunch of work-like activity, throw it behind an API, and let you trigger it programmatically with your crazy UI ideas, and I’d be lying if I said I didn’t get a huge kick out of it.

Now, do I feel capable of replacing people’s jobs with this technology? No, not at all. And sure, you could say “well you won’t be doing it, Nate, a team of real developers, potentially with a foosball table, will be doing it”, and you’d be right. But the people telling me about everyone being replaced by robots aren’t those people. They are people who don’t even do what I’ve done — they are just (maybe) poking at ChatGPT and extrapolating not with a low-code platform, but with their own overactive imaginations.

Not me, pal! I come from the tech school of hard knocks. I’m under the hood, building artisanal AI applications by hand and developing a fine, nuanced palate for all things prompt-related. So I’m wise to your games. I’m not some startup rube about to be played by a bunch of AI sharks looking to sell shovels in a pointless, hype-driven gold rush!

Ooh, that reminds me, I have to go pay my OpenAI bill.

NATE SULLIVAN DOT COM
