
2023-04-28 · Anwari Fikri
Dev Log

Building my first Discord Bot (2/4)

Last week, I wrote a blog post about my new project: a Discord bot that scrapes articles from Borneo Bulletin and summarizes them. Here’s an update on its progress. In short, I have a somewhat working scraper, and I have integrated it into my Discord bot, which is deployed on a virtual private server. I am now building the bot with a friend, who is working on the summarization feature. Of course, I faced some difficulties along the way, and I will discuss them in this post.

I can say that I am halfway to a minimum viable product (MVP). Below is my day-to-day progress, with a short explanation of each day.

Day 1

I spent the first day writing my introductory blog post on the project and doing a little research on how to create a Discord bot. No development was done on this day.

Day 2

In my previous blog post, I mentioned that I want to treat this project like a real software development project. So on Day 2, I started by writing a requirements engineering document in Google Docs, with the help of ChatGPT. Then I began planning the development timeline in Jira Software. Listing tasks helps me prioritize which ones to complete first and identify optional features that are still worth including. This is valuable to me because I often spend too much time considering whether to add new features that aren't explicitly mentioned in the requirements.

Jira Software is one of the most popular project management tools designed to help teams plan, track, and manage work.

Jira timeline for Borneo Bulletin Bot

I began working on the scraper. My first plan was to use Scrapy, which didn’t go well. The first problem was how difficult Scrapy was to use: it has so many unfamiliar concepts, such as spiders, middlewares, pipelines, and configs, that I felt overwhelmed. The second problem is that the articles on the Borneo Bulletin website are loaded with JavaScript, which means I can’t simply scrape the page as it is first served. I need to wait for the JavaScript to load the articles before I can scrape them. I believe there is a way to do this with Scrapy, but instead I changed my approach and used a different tool called Selenium.

Me when I change from Scrapy to Selenium

While Selenium is not designed for scraping, it can be used for it. Selenium automates browser interactions such as clicking buttons or filling out forms, and it is very easy to use. You can write a Selenium script as a logical sequence of steps, for example:

  • Go to the headlines URL
  • Find the article URLs that are tagged as headlines
  • Follow the URL link
  • Grab the title, author, date and the article content
  • Repeat until all links are crawled
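
The steps above can be sketched roughly as follows. This is only an illustration, not the project's actual scraper: the URL and every CSS selector are hypothetical placeholders, and the 10-second implicit wait is one assumed way of giving the JavaScript time to load the articles.

```python
from dataclasses import dataclass

# Same string value as selenium.webdriver.common.by.By.CSS_SELECTOR.
CSS = "css selector"

# Hypothetical placeholders: swap in the real headlines URL and selectors.
HEADLINES_URL = "https://example.com/headlines"
LINK_SELECTOR = "a.headline"
FIELD_SELECTORS = {"title": "h1", "author": ".author", "date": "time"}
BODY_SELECTOR = "article p"


@dataclass
class Article:
    url: str
    title: str
    author: str
    date: str
    content: str


def make_driver():
    """Start Chrome; imported lazily so the helpers below need no browser."""
    from selenium import webdriver

    driver = webdriver.Chrome()
    driver.implicitly_wait(10)  # wait up to 10 s for elements added by JavaScript
    return driver


def collect_links(driver):
    """Open the headlines page and return each article URL once, in page order."""
    driver.get(HEADLINES_URL)
    links = []
    for anchor in driver.find_elements(CSS, LINK_SELECTOR):
        href = anchor.get_attribute("href")
        if href and href not in links:
            links.append(href)
    return links


def scrape_article(driver, url):
    """Follow one article link and grab the title, author, date and content."""
    driver.get(url)
    fields = {name: driver.find_element(CSS, selector).text
              for name, selector in FIELD_SELECTORS.items()}
    paragraphs = driver.find_elements(CSS, BODY_SELECTOR)
    content = "\n".join(p.text for p in paragraphs if p.text)
    return Article(url=url, content=content, **fields)


def crawl(driver):
    """Collect all links first, then visit them one by one."""
    return [scrape_article(driver, url) for url in collect_links(driver)]
```

To try it, create a driver with make_driver(), pass it to crawl(), and call driver.quit() when finished. The links are collected as plain strings before any article is opened, because element handles from the headlines page stop being valid once the browser navigates away.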

Because it simulates a real user in a browser, you don't have to set up things like a user agent or proxy, which makes scraping less complicated and less frustrating. With a little trial and error, I was able to get the URLs of the articles that I want to scrape on Day 2.

Day 3

Since I already had the URL for each article, I could start scraping the pages. Nothing too complicated here; it just needed a lot of trial and error again to scrape the right content. I also managed to scrape the image from each article.

I also invited a friend to work on this project with me, and he will be working on the GPT-3.5 summarization. While he works on the summarization, I can start working on the Discord bot. I only got to work on the bot a little on Day 3.

Day 4

Getting started with a Discord bot is surprisingly easy because there are a lot of resources online and the syntax is simple. You can run a very simple Discord bot that says “hello” back when you send “hi” in the chat in fewer than 50 lines of code.

Bot sends 'hi' when any user says 'hello' in a Discord text channel
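
For reference, a minimal greeting bot along these lines might look like the sketch below. It assumes discord.py 2.x and is not the exact starter code I used; the reply_for and build_client helper names are made up for this example, and the trigger and reply words can be swapped to taste.

```python
def reply_for(content):
    """Return the bot's reply for a message, or None to stay silent."""
    if content.strip().lower() == "hi":
        return "hello"
    return None


def build_client():
    """Create the Discord client and register its event handlers."""
    import discord  # discord.py; imported here so reply_for works on its own

    intents = discord.Intents.default()
    intents.message_content = True  # needed to read message text in discord.py 2.x
    client = discord.Client(intents=intents)

    @client.event
    async def on_ready():
        print(f"Logged in as {client.user}")

    @client.event
    async def on_message(message):
        if message.author == client.user:
            return  # ignore the bot's own messages
        reply = reply_for(message.content)
        if reply is not None:
            await message.channel.send(reply)

    return client
```

Starting the bot is then a matter of calling build_client().run(token), where token is the bot token from the Discord developer portal. The message content intent also has to be enabled for the bot in the portal, otherwise message.content arrives empty.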

One thing to consider when creating a Discord bot is whether to use discord.py or Nextcord. The two libraries work in much the same way, but Nextcord is a fork of discord.py that is said to include additional features. While this may sound compelling, I still chose discord.py because its Discord server has over 40,000 members, while Nextcord's has fewer than 3,000 (at the time of writing).

I then modified the starter code to include the features from my scraper, and I managed to create a bot that returns a list of the day's headline articles when I type “whats brewing today” in the channel. This took a lot of trial and error, and I also asked and researched a few questions in the discord.py Discord server that I joined. This is when I realized how useful it is to join the Discord server for the library you are using.

Bot sends headline articles for the day when user says 'whats brewing today'
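
The reply itself can be built with a small helper like the one below. This is a sketch, not my actual bot code: it assumes the scraper hands back (title, url) pairs, and the function names are made up for illustration. Discord rejects messages longer than 2,000 characters, so the list is split into several messages when needed.

```python
TRIGGER = "whats brewing today"
DISCORD_LIMIT = 2000  # maximum characters in one Discord message


def is_trigger(content):
    """True when the message is the headline request, ignoring case and spaces."""
    return content.strip().lower() == TRIGGER


def format_headlines(articles, limit=DISCORD_LIMIT):
    """Turn (title, url) pairs into a list of messages, each within the limit.

    A single entry longer than the limit is not split and would still be too long.
    """
    messages, current = [], ""
    for number, (title, url) in enumerate(articles, start=1):
        line = f"{number}. {title}\n{url}\n"
        if current and len(current) + len(line) > limit:
            messages.append(current)
            current = ""
        current += line
    if current:
        messages.append(current)
    return messages
```

Inside the bot's on_message handler, each returned string would be sent with its own message.channel.send call whenever is_trigger(message.content) is true.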

Since I had some spare time for the rest of the day, I researched how to deploy the bot so that it can run continuously without me having to run it on my local machine. I figured out that I need a Virtual Private Server (VPS), which is basically a remote computer that I can access from my local computer. The VPS doesn't have a graphical interface, so I can only interact with the bot through a terminal on my computer.

After researching several providers, I decided to buy a VPS from RackNerd, the cheapest VPS provider I could find. It costs around $30 BND per year (equivalent to $2.50 BND per month). Although the server is located in the USA, it is not a big deal for me as I just need to get my bot running continuously.

Day 5

I watched some tutorials on how to deploy a Discord bot on a VPS and learned to access the VPS through PuTTY. The rest was just a matter of uploading the files to the VPS and running the code. I had a small hiccup when I tried to deploy my bot, with errors coming from my own code. Ultimately, I just needed to add --no-sandbox and run Selenium in headless mode. Now I have my Discord bot up and running 24/7.

Bot running on Virtual Private Server (VPS)
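
For anyone hitting the same deployment error, the two settings go into the Chrome options before the driver is created. The sketch below assumes Chrome driven by Selenium 4; the helper name is made up for this example.

```python
# Flags passed to Chrome on a server without a display.
CHROME_ARGS = [
    "--headless",    # run without opening a browser window
    "--no-sandbox",  # Chrome refuses to start as root with the sandbox on
]


def make_headless_driver():
    """Create a Chrome driver suitable for a VPS with no graphical interface."""
    from selenium import webdriver  # imported here so CHROME_ARGS can be reused alone

    options = webdriver.ChromeOptions()
    for argument in CHROME_ARGS:
        options.add_argument(argument)
    return webdriver.Chrome(options=options)
```

Note that both flags start with two plain hyphens. Some editors and blog engines turn a double hyphen into a long dash, and Chrome will not recognize the flag in that form.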

Afterword

I got a lot done during the 5 days, but I am still not finished. I realized that my scraper is not actually scraping all the content that I want: it only scrapes some of the headlines, not all of them. My bot also needs some work on its file structure and on one of its core features: running the scraper at a certain time of the day.

I have 5 more working days for this, and I believe I can finish the project within the time limit that I set.
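
One possible way to approach the daily schedule, using only the standard library, is to compute how long to sleep until the next run time. This is a sketch of an idea and not something the bot does yet; the function names are made up, and the times follow the server's local clock, which matters because my VPS is in the USA. The tasks extension in discord.py looks like another option worth exploring.

```python
import asyncio
import datetime as dt


def seconds_until(run_at, now):
    """Seconds from now until the next occurrence of the clock time run_at."""
    target = dt.datetime.combine(now.date(), run_at)
    if target <= now:
        target += dt.timedelta(days=1)  # today's slot has passed, use tomorrow's
    return (target - now).total_seconds()


async def run_daily(job, run_at):
    """Sleep until run_at, await the job, and repeat forever."""
    while True:
        await asyncio.sleep(seconds_until(run_at, dt.datetime.now()))
        await job()
```

For example, at 07:00 with a run time of 08:00 the helper returns 3600 seconds, and at 09:00 it returns 82800 seconds, which is the wait until 08:00 the next day.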
