
Mike Kreuzer

Ripley - now with added Elixir

August 21, 2016

I've been scraping Reddit's programming language subreddits since April, ranking them by number of subscribers, and the results are up on the Ripley site. This morning I rewrote the scraper that generates that data in Elixir; originally I'd written it in Go.

The Elixir code's still in a branch of its own, waiting until I write some comments and some tests. Though, thinking about it now, the Go version's free of both of those too, so this needn't stay unmerged for long.

I planned on writing a "6 months with Elixir" sort of post, and still mean to, but I thought some code might be better. I'll add this observation here though: I wrote that Elixir solution twice. The first one was a set of piped tasks, and when I finished it, it felt like the most Elixir thing I'd ever seen. It ended something like:

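# Read the configured subreddits and start one scrape task per subreddit
Application.fetch_env!(:scraper, :subreddits)
|> Enum.map(&Task.async(Scraper, :scrape, [to_struct(&1)]))
# Wait up to 60 seconds for each task to finish
|> Enum.map(&Task.await(&1, 60_000))
# Rank by subscriber count, largest first
|> Enum.sort_by(&(&1.count), &>=/2)
|> write_file()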

But of course, it wasn't. The most Elixir thing. Not really. Because that had no supervision of the separate processes: Task.async links each task to the caller, so one crashed scrape takes the whole run down with it. And at the six-month mark of my Elixir journey, supervision's what Elixir means to me.
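For contrast, here is a minimal sketch of what a supervised variant of that pipeline might look like. It is not the code from the post, which isn't shown. The module name SupervisedScrape and its run/3 function are hypothetical, the scrape function is passed in as a stand-in for Scraper.scrape/1, and it assumes an Elixir release recent enough to include Task.Supervisor.async_stream_nolink/4.

```elixir
# Hypothetical module, for illustration only; not the post's actual code.
defmodule SupervisedScrape do
  # Runs `scrape_fun` once per subreddit under a Task.Supervisor and returns
  # the successful results sorted by :count, largest first. A task that
  # crashes or times out is dropped instead of taking the caller down.
  def run(subreddits, scrape_fun, timeout \\ 60_000) do
    # Started inline to keep the sketch self-contained; in a real application
    # the supervisor would more likely sit in the app's supervision tree.
    {:ok, sup} = Task.Supervisor.start_link()

    results =
      sup
      |> Task.Supervisor.async_stream_nolink(subreddits, scrape_fun,
        timeout: timeout,
        on_timeout: :kill_task
      )
      |> Enum.flat_map(fn
        {:ok, result} -> [result]
        # Crash or timeout: skip this subreddit
        {:exit, _reason} -> []
      end)
      |> Enum.sort_by(&(&1.count), &>=/2)

    Supervisor.stop(sup)
    results
  end
end
```

Called as something like SupervisedScrape.run(subreddits, &Scraper.scrape/1), a single failing scrape would cost one row in the ranking rather than the whole run. Whether dropping failures quietly is the right policy is a design choice; retrying or reporting them are equally reasonable alternatives.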

At six months I'm still excited to see just how deep this rabbit hole goes.

Update November 2023: I've taken my code off GitHub, so this code's no longer available there. The Ripley site no longer exists.
