A Story of Many Failures and One Success

In this blog, join me as I embark on the complex journey of semi-automating an AI-focused weekly newsletter. You’ll get a first-hand look at a diverse set of tools and tactics that I’ve put to work on this real-life puzzle.

Ready? Let’s go!

At first glance, the mission seems quite simple: select 5 to 7 of the most exciting AI news items from the past week and compile them into a newsletter. This task has traditionally been a wholly human endeavor. But, given the sheer volume of AI news these days, it could easily become a full-time gig for a small crew. It requires a tremendous amount of time to filter through all the news, handpick the most engaging ones, and then craft an enthralling newsletter. Let’s be real, I don’t have that kind of time! So, the only feasible option is to automate whatever parts I can. 

To give you a bit of background, let’s travel back to the early days of my career. I was once a full-time software developer, spending my days writing code from 2011 to 2017. Then I transitioned into a developer advocate role, where coding remained, but it was confined to smaller tasks like building tutorials. By 2020, I had assumed the role of COO at AI Infrastructure Alliance, which meant focusing on operations while coding drifted into the background.

So, here I am now, more of a casual coder. I still have my foundational software development skills intact, but I’m not exactly up to speed with the current landscape of programming languages and their libraries.

But with today’s Centaur tools, that doesn’t matter.

If you can think it, AI can help you build it.

Is AutoGPT all you need?

Full of optimism, I began my journey by testing one of the most talked about solutions on the planet: AutoGPT and its counterparts. YouTube and LinkedIn feeds were overflowing with thrilling posts and videos, showing a world where AutoGPT does all the heavy lifting while you relax with a cup of coffee and pet your dog.

First, I picked the most accessible tool: AgentGPT. All it needed was an OpenAI API key, and it promised to handle a complex series of planning steps and tasks at the push of a button. Too easy. Let’s do this. I gave it a simple mission:

Find the most important and exciting news about AI in the last 7 days; provide links to sources and short summaries

It was kind of fun to watch it break down the task into several subtasks, like visiting top tech news portals and skimming through the latest news. But then…

As an AI language model, I do not have access to the internet and cannot perform web scraping. However, I can provide a general response based on recent developments in the AI industry.

Well, that was a bummer. After all the hype about AI agents in my feed, I certainly didn’t expect it to fall short in web scraping. So, what’s its actual utility, then?!

Undeterred, I decided to try another contender – BabyAGI – and… 

Same deal. Interestingly, it managed to be even less helpful – it was without internet access, but it had a vivid imagination, generating a multitude of links and summaries. Sadly, none of them hit the mark.

However, I wasn’t about to throw in the towel just yet. Rumor has it that AutoGPT does have access to the internet. Well, let’s put that to the test!

On a positive note, AutoGPT is pretty easy to boot up on your desktop, and it indeed has the chops to scrape the web. I threw it at this task at least 5 times (and tried it on a slew of other tasks, too, in the hope of identifying a use-case where it outperforms; spoiler: I came up empty-handed).

The way AutoGPT operates is somewhat different from the other two tools: you first define what your agent is assigned to do, then establish up to 5 goals for it. It formulates a range of tasks, and after each task, it can pile on more, and also invite your feedback if you wish to adjust something.

Let’s walk through a couple of my attempts.

Newsmaker is: an AI designed to read a lot of news and find the most interesting, valuable, important, and exciting ones about AI.

Goal 1: Read the list of links from a file “links.txt”, visit each of these links, find all links on those pages, and write them into a new file “found_links.txt”.

Goal 2: Visit each of the links in the “found_links.txt” file, summarize the text in each of these links, and decides if this is an interesting article about AI or not; if it is an interesting article, output the link, a short summary and a short comment on why it’s interesting for a potential reader.

Goal 3: Collect all interesting links with summaries and comments in the “interesting_stuff.txt” file.

As you can tell, I decided to lend a hand by supplying a list of links to scrape (the news sites). But it fumbled right from the get-go, while dealing with files. Despite having access to the file system, for some inexplicable reason it chose to open “found_links.txt” first, the file it was supposed to create in Goal 1, and crashed. What a shame.

That got me thinking: maybe we should let it off the leash a bit?

Newsmaker is: an AI designed to collect and summarize the weekly news about AI.

Goal 1: Identify the most reputable sources of news about AI.

Goal 2: Find the news about AI for the last 7 days.

Goal 3: Identify the most popular news about AI for the last 7 days.

Goal 4: Provide a list of those news with links and short summaries.

But sadly, no. AutoGPT did locate a few websites from which to gather news, but then it made a somewhat strange choice to directly ask them for “popular news about AI”.

To add to that, AutoGPT was in a bit of a raw state when I conducted my experiments, and it tended to encounter problems quite often for various reasons. One particularly frustrating issue was its failure to communicate effectively with its own subagents.

However, the standout issue common to all three tools is their never-ending nature. I never managed to see all the tasks come to fruition. What I wanted was a finish line: give me a list of news stories and summaries. It just never got there. Like the Energizer Bunny, it kept going and going and going. You can extract some valuable insights from the task execution log, but the system usually keeps adding more tasks after each one is done, leaving you with a constantly growing list that never seems to end.

So much for AI taking over and doing everything for us. The magic demos ended up being a bit of a mirage. 

What now?

Enter Telegram bot: The perfect Centaur companion

After the earlier setback, I opted to start from square one and stick to the conventional software development methods. That is, we first break down the task into small, manageable chunks and deal with them one by one.

For this challenge, the typical human approach unfolds like this:

  • Hunt for a wealth of AI news.

  • Digest all of them.

  • Cherry-pick the most captivating, intriguing, or consequential ones.

  • Pen compelling summaries.

  • Merge the summaries into a cohesive newsletter.

  • Review and dispatch.

In a perfect world, I’d want a machine to handle all of this. If I could create a system that sends a personalized newsletter to my inbox every morning, featuring the most critical news from the past 24 hours, I might finally be able to quell my FOMO.

Initially, I invested some time attempting to teach an AI to separate exciting AI news from the rest, with the goal that it could autonomously scrape news websites. However, success remained elusive within a reasonable timeframe, so I opted to prioritize more achievable tasks.

To jumpstart the process, a friend of mine suggested an alternative: manually gather links via a Telegram bot, then use a combination of conventional code and AI to select a few and craft a newsletter. After all, we’re already reading a ton of fascinating stuff and sharing it amongst ourselves, so why not broaden our sharing circle?

So, I dived into a few tutorials about creating Telegram bots, and I quickly realized that I needed a backend app running continuously. The easiest way to achieve this was by using Replit.

Let’s take a moment to appreciate Replit: I’m genuinely amazed by their achievements and the progress they continue to make. They’ve brought to life an online IDE – something that would have seemed borderline inconceivable a mere five years ago. They’ve done away with the need to set up a local environment or wrangle with AWS deployments, and that’s just the tip of the iceberg. They’ve even made coding on your smartphone a reality! Oh, how I wish they’d been around four years ago when I was completely consumed with my newborn and had a burning desire to code something purely for fun!

But they’re here now, and it’s time to put them to work.  

Replit delivered three key features for the project:

  • A vast compilation of templates: Bootstrapping the Telegram Bot was as simple as searching for “Telegram” in the templates library. Just like that, in a matter of minutes, I had a live sample bot to tinker with.

  • Immediate hosting of my script as a service. 

  • Ghostwriter: This is Replit’s coding sidekick. It comes in various forms: It suggests possible next lines of code as you write; it supports a few context menu operations (such as transforming a piece of code based on a plain English request); it responds to your queries in a chat window; and it even lends a hand in debugging exceptions.

The first two are fantastic. As for Ghostwriter, it’s pretty good, but it’s not fully baked yet. For instance, when I asked if it could access the internet and read the documentation, it confirmed it could. However, when I shared a link, it produced some wildly imaginative responses without actually reading the link: the dreaded LLM hallucination issue. Despite all this, I’m betting it will get better in the not-too-distant future, so give it a try.

My first task was to create a bot that gathers links into a simple database. As you can probably imagine, there’s not much to elaborate on this aspect. Thanks to the support of Ghostwriter and occasional input from ChatGPT, I was able to make quick work of it.
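For the curious, here is a minimal sketch of what the link-gathering part could look like. It is not the bot's actual code: the SQLite storage, the table layout, the URL pattern, and the function names are all my assumptions for illustration, and the Telegram wiring (receiving the message and replying) is left out entirely.

```python
import re
import sqlite3

# Assumed pattern: an http(s) URL runs until the next whitespace character.
URL_PATTERN = re.compile(r"https?://\S+")


def init_db(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) links table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS links ("
        " url TEXT PRIMARY KEY,"
        " sender TEXT NOT NULL)"
    )


def extract_links(message_text: str) -> list[str]:
    """Return every URL in a chat message, trimming trailing punctuation."""
    return [url.rstrip(".,;:!?)\"'") for url in URL_PATTERN.findall(message_text)]


def save_links(conn: sqlite3.Connection, sender: str, message_text: str) -> int:
    """Store the new links from one message and return how many were added."""
    added = 0
    for url in extract_links(message_text):
        # INSERT OR IGNORE skips URLs that somebody has already shared.
        cursor = conn.execute(
            "INSERT OR IGNORE INTO links (url, sender) VALUES (?, ?)",
            (url, sender),
        )
        added += cursor.rowcount
    conn.commit()
    return added
```

A bot handler would then call something like `save_links(conn, username, message_text)` for every incoming message and reply with the number of links it stored.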

Let’s dive into it.

Summarizing news articles

This segment appears fairly simple at first. But as always, the devil is in the details.

The straightforward part is summarizing a chunk of text. There are several ready-made solutions out there, like LangChain’s summarizers. While you have the option of using a preset prompt, you can also compose your own, which is the route I took. I preferred to receive a list of bullet points as opposed to a continuous text, so I ended up crafting something along these lines:

The tricky part, however, lies in obtaining the text, to begin with. If you’re examining a news article, you’re in luck – you can scrape the HTML, get rid of the tags, and feed it to the LLM. But if you’re looking at a tweet, a Reddit thread, or an Arxiv article, it can get tricky fast. Well, I figured, for the sake of proof of concept, it’s reasonable to assume we’re dealing with news and blog posts. I’ll tackle the rest later.
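To make the "scrape the HTML, get rid of the tags, and feed it to the LLM" step concrete, here is a rough sketch using only Python's standard library. The class and function names are made up for illustration, and the prompt wording is a hypothetical stand-in, not the prompt I actually used. Fetching the page and calling the LLM are deliberately left out.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self._chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that sits outside the skipped tags.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def html_to_text(html: str) -> str:
    """Strip the tags from an HTML page and return its text content."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def build_summary_prompt(article_text: str, max_bullets: int = 5) -> str:
    """Build a bullet-point summary request (hypothetical wording)."""
    return (
        f"Summarize the article below as at most {max_bullets} bullet points. "
        "Put each bullet on its own line, starting with '- '.\n\n"
        f"Article:\n{article_text}"
    )
```

The string returned by `build_summary_prompt(html_to_text(page_html))` is what would be sent to the model. A real page also carries menus, footers, and cookie banners, so expect some noise in the extracted text.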

Selecting the best news

The next hurdle was deciding on the most engaging, intriguing, and valuable news to include in the newsletter. The main challenge here is the inherent difficulty in defining what makes news “engaging, intriguing, and valuable.” For the sake of time, my friend and I decided that this task could be performed by a human. Hence, I added commands to select and deselect news. So far, it was smooth sailing.

Writing a newsletter

Some might argue that this step is a breeze since we could simply repurpose the summaries we already have. However, these summaries are rather dry and formal, whereas we need something genuinely engaging and easy to digest. 

Consequently, I decided to utilize guidance – a recent library from Microsoft Research that allows the creation of intricate templated requests to LLMs. The plan was to request the AI to generate three alternative versions for each section of the newsletter, as well as its title, for each chosen news item. I drafted a rather terrible request, which my friend, an experienced writer of over 20 years, extensively refined. This is what we ended up with:

So, upon executing a Telegram bot command, we obtained a PDF preview and a DOCX file brimming with text. The only task left for a human was to select the most appealing writings for each section and do a bit of copy-pasting.

Are we all set to launch? Hold on just a second…

Selecting the best news… Again

Late one night, as I was contemplating the tantalizing idea of using AI to cherry-pick the best news for our newsletter, inspiration struck. The stumbling block? We can’t just shovel 30+ summaries into an AI’s lap and expect it to serve up the 7 most riveting ones on a silver platter. I mean, I couldn’t whip up a prompt that would do the trick accurately. And don’t even get me started on rating each piece on a scale of 1 to 10! Defining scoring criteria was another puzzle.

Hold on a minute, though. In AI’s defense, we humans would struggle with these tasks too. That’s when I recalled an approach used in RLHF: present a reviewer with two pieces and ask them to select the better one. It’s a bit subjective, sure, but repeat this enough times, and the results become pretty dependable.

Inspired by this, I rolled out a multi-elimination tournament for news. In each round (two to four rounds, depending on the number of news items), I randomly paired up news items that had the same score, with everyone starting from zero. The LLM was then asked to choose the better of the pair, and the winner earned a point. Rinse and repeat until the final round, and voila: we simply return the news items with the top scores.
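Here is a small sketch of how such a tournament could be coded. Treat it as an illustration under assumptions rather than the bot's real implementation: `pick_better` is a hypothetical callback standing in for the LLM call, an item left without a partner simply sits the round out, and ties in the final ranking keep their input order.

```python
import random
from collections import defaultdict
from typing import Callable, Optional, Sequence


def run_tournament(
    items: Sequence[str],
    pick_better: Callable[[str, str], str],
    rounds: int = 3,
    top_k: int = 7,
    seed: Optional[int] = None,
) -> list[str]:
    """Rank unique items by pairwise wins and return the top_k of them."""
    rng = random.Random(seed)
    scores = {item: 0 for item in items}  # everyone starts from zero

    for _ in range(rounds):
        # Group items by their current score so that only equals meet.
        groups = defaultdict(list)
        for item, score in scores.items():
            groups[score].append(item)

        for group in groups.values():
            rng.shuffle(group)  # random pairing inside each score group
            # With an odd group size, the last item gets no match this round.
            for first, second in zip(group[0::2], group[1::2]):
                winner = pick_better(first, second)
                if winner not in (first, second):
                    raise ValueError("judge must return one of the two items")
                scores[winner] += 1

    # sorted() is stable, so tied items keep their original order.
    ranked = sorted(items, key=lambda item: scores[item], reverse=True)
    return ranked[:top_k]
```

In the bot, `pick_better` would wrap a prompt that shows the model two summaries and asks which one belongs in the newsletter. Each round costs at most one LLM call per pair, so 30 items over three rounds need no more than 45 comparisons.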

And guess what? We made it!

The Sky’s the limit

And that’s it, folks! I successfully set up a process where nearly everything runs on autopilot, except for the gathering of news and cherry-picking the best texts. We still have the human in the loop, doing what humans do best: giving meaning to information. That’s the same insight that Google had with the original PageRank system. Let humans do what they do best and machines do what they do best.

If you’re interested in seeing the fruits of this labor, you’re more than welcome to sign up at AI News Now.

Before I sign off, I want to leave you with a thought that’s been buzzing in my head:

This bot serves as a testament to the exciting times we’re hurtling towards – the dawn of a new era, the era of Centaur software and intelligent assistants.

It’s possible because we now have LLMs that can streamline heuristics in a way that traditional methods never could. But let’s not forget our roots: the traditional methods, good old-fashioned task planning and hand-coded logic. Remember, there’s no need to call in the AI cavalry when a simple regex will do the trick!

I’d love to hear about your adventures with AI in your work. Drop a comment below, and may the Centaur powers be with you!

This blog has been republished by AIIA. To view the original article, please click HERE.