What’s going on with Steam Spy?

Sergiy Galyonkin
Sergiy Galyonkin’s blog
Apr 27, 2018

After multiple websites reported that Steam Spy is dead, I think it is time to discuss what is actually going on.

What happened?

As you might know, on April 11, 2018, Valve changed the Steam Web API so that it no longer includes the list of owned games in a user's information unless that user actively opts in.

Many people, myself included, at first attributed that change to GDPR compliance, but it doesn’t seem to be the case. Valve still exposes your real name, achievements, groups, screenshots, and friends by default and still hasn’t updated their EULA to comply with GDPR. It doesn’t mean they won’t, but the API change was definitely not caused by that.

Steam Spy used to rely on polling user information through that API. Naturally, the old algorithm stopped working, and while I was busy creating a new one, some press rushed to pronounce the site dead.

To make matters worse, Steam also changed its Store API, making it useless. The Store API contained basic information about games on Steam: prices, release dates, genres and so on. Steam Spy (like many other sites) relied on that as well. Like the previous change, this one wasn't caused by GDPR compliance either; it's hard to imagine Valve protecting the store's privacy.

What did Valve say?

Nothing, as usual.

I wrote a proposal to Valve that would still let Steam Spy run using the old algorithm without exposing any personal data. I got a confirmation from Tom Giardino that they had received my message, but that was it.

To be honest, that was already more than they ever said to me before.

Why keep it running?

Just as these two blows from Valve landed, we were hit with a major workload at my day job. The game I'm working on was down for an extended period of time, and I didn't have time to work on Steam Spy.

However, during that period I received over two hundred emails and messages from developers telling me how Steam Spy improved their lives. There was an indie company from Berlin that managed to secure financing from the government for their niche title because they had the data to prove that this niche is big enough. The title got released and succeeded.

Then there was a successful mid-sized publisher that entered the business after it was able to see which games are selling and which aren't.

And then there were your usual stories of developers being able to navigate the space because they knew how the market behaves now.

So, after a very stressful week at my day job I decided to try a couple of new things with Steam Spy.

How does Steam Spy work now?

In a previous life, almost 15 years ago, I was writing a Ph.D. thesis on using machine learning to predict economic outcomes from incidental data that might seem irrelevant to the predicted results.

The trick with machine learning algorithms is that they're fantastic at solving classification problems ("Is this a man or a cat?") but suck at regression problems ("Product A sold 10,000 copies and B sold 20,000; how many did C sell?"). The idea is that specially prepared data combined with a modified algorithm can work a bit better for problems like that.

I never finished that Ph.D. (I was poor as a church mouse and couldn't afford it), but now, with the recent changes to the Steam API, I decided to give the idea a try. Thanks to the power of the Internet, I have a ton of coincidental data on games, and most of it doesn't come from Steam.

Well, guess what, it kinda works!

Frostpunk devs just announced that the game sold 250,000 copies and the new algorithm estimated it at 252,000 copies. Pretty close, right?
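To put a number on "pretty close", here is a minimal sketch of the relative error for the Frostpunk figures quoted above. The function name `relative_error` is just an illustrative helper, not something Steam Spy exposes.

```python
def relative_error(estimate: float, actual: float) -> float:
    """Return the absolute difference as a fraction of the actual value."""
    if actual <= 0:
        raise ValueError("actual must be positive")
    return abs(estimate - actual) / actual


# Figures from the post: 252,000 estimated vs 250,000 announced.
frostpunk_error = relative_error(252_000, 250_000)
print(f"Frostpunk estimate is off by {frostpunk_error:.1%}")
```

That works out to a difference of 2,000 copies on 250,000, or 0.8%.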

Does everything work like before?

Unfortunately, no.

I’m still rewriting the site to move to the new math model. Many features of the site are still unavailable, but most of them will be coming back.

For example, I’ve temporarily removed country stats. I wasn’t happy with them before, and now is a good chance to rework them into something more usable.

On the other hand, the basic features (the number of owners, playtime distribution, related games) are working fine already. The API is also back, although in a somewhat limited form.

How accurate is the new Steam Spy?

Not very accurate, to be honest.

I have the real sales data for around 70 games from different developers, and for 90% of them, the new Steam Spy estimate is within a 10% margin of error. But I also saw some crazy outliers, where the estimate and the real number differed by a factor of five.
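A rough sketch of how a check like this could be computed is below. Apart from the Frostpunk pair, the numbers are invented placeholders and not the real dataset, and the names `share_within_margin` and `SAMPLE` are hypothetical. It also shows how a "90% within 10%" result can coexist with a fivefold miss.

```python
from typing import Iterable, Tuple


def relative_error(estimate: float, actual: float) -> float:
    """Absolute difference as a fraction of the actual value."""
    return abs(estimate - actual) / actual


def share_within_margin(
    pairs: Iterable[Tuple[float, float]], margin: float = 0.10
) -> float:
    """Fraction of (estimate, actual) pairs whose relative error <= margin."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one (estimate, actual) pair")
    hits = sum(1 for est, act in pairs if relative_error(est, act) <= margin)
    return hits / len(pairs)


# Placeholder (estimate, actual) pairs; only the Frostpunk pair is from the post.
SAMPLE = [
    (105_000, 100_000),
    (48_000, 50_000),
    (20_500, 20_000),
    (9_300, 10_000),
    (252_000, 250_000),  # Frostpunk
    (76_000, 80_000),
    (31_000, 30_000),
    (14_000, 15_000),
    (610_000, 600_000),
    (200_000, 40_000),   # a fivefold outlier
]

share = share_within_margin(SAMPLE)
worst = max(relative_error(est, act) for est, act in SAMPLE)
print(f"{share:.0%} of games within 10%; worst miss is {worst:.0%} off")
```

A summary like "90% within 10%" says nothing about how bad the remaining 10% can be, which is why the outliers matter.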

What’s with the new access rights?

You might’ve noticed that Steam Spy now displays the number of owners as a range and doesn’t display any graphs to the general audience. Patreon backers still have access to more precise estimates and the graphs, like before.

I did that because I’m still not entirely happy with my new algorithm and its precision, and also because a lot of things on Steam Spy are still broken. I do believe in giving everyone access to the essential information, but until I fix everything, Steam Spy will be semi-closed to the general audience.

What’s next?

I will keep iterating on the new algorithm while slowly bringing back the core functions of Steam Spy.

It will take some time and it’s still possible that Valve will make another move to shut down the service, but until that happens, Steam Spy will continue to operate.
