The accidental digital archivist
I had other things I planned to do at the beginning of this year. Google had gone out of their way to push Google+ as a place for communities to come together, and communities in fact coalesced. I thought I had until summer to help save information that Google was about to delete. Then Google pulled the rug out from under us and moved the date from August to April.
This resurrected blog was the start of my journey.
I wrote a quick Python hack to parse the awful HTMLish content of the JSON dump of my personal Takeout and, with some massaging, created the content of this blog. I then pulled in other content I had written, to have it all in one place, under my own control.
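For the curious, the general shape of that kind of hack can be sketched as follows. This is not my actual script, only a minimal illustration using Python's standard library. It assumes the dump is a JSON list of post objects with a hypothetical `content` field holding HTML-ish fragments, and it handles only a handful of tags (`b`, `i`, `s`, `br`, `a`); real export data is messier and the field names will differ.

```python
import json
from html.parser import HTMLParser


class HtmlishToMarkdown(HTMLParser):
    """Convert a small subset of HTML-ish tags into Markdown text."""

    # Tags whose opening and closing forms map to the same Markdown marker.
    MARKERS = {"b": "**", "i": "_", "s": "~~"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []   # output fragments, joined at the end
        self.hrefs = []   # stack of link targets for open <a> tags

    def handle_starttag(self, tag, attrs):
        if tag in self.MARKERS:
            self.parts.append(self.MARKERS[tag])
        elif tag == "br":
            self.parts.append("\n")
        elif tag == "a":
            self.hrefs.append(dict(attrs).get("href") or "")
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in self.MARKERS:
            self.parts.append(self.MARKERS[tag])
        elif tag == "a" and self.hrefs:
            self.parts.append("](" + self.hrefs.pop() + ")")

    def handle_data(self, data):
        self.parts.append(data)


def htmlish_to_markdown(fragment):
    """Return a Markdown rendering of one HTML-ish fragment."""
    parser = HtmlishToMarkdown()
    parser.feed(fragment)
    parser.close()  # flush any buffered text
    return "".join(parser.parts)


def posts_to_markdown(dump_text):
    """Assumed layout: a JSON list of objects with a 'content' field."""
    posts = json.loads(dump_text)
    return [htmlish_to_markdown(post.get("content", "")) for post in posts]
```

Each resulting string can then be written out as the body of a post for whatever static site generator is in use. Unknown tags are silently dropped while their text is kept, which is a deliberate choice for a one-off conversion where losing formatting is better than losing words.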
Having learned how to do that, when I saw Eric Lien ponder what would happen to the information collected over the years in the community building his HercuLien and Eustathios 3D printers, I wondered whether it would be hard to use my script as the basis for a similar static site. I started from the Friends+Me Google+ Exporter tool’s JSON output, since Google didn’t bother to provide useful data to communities and, at that time, had not even suggested that they might provide something useful some day.
Once I had done that, it seemed natural to amortize the work by archiving more maker communities, so I kept offering to help other maker community owners save their data from digital oblivion. By the end, the last one I did took less than 20 minutes of work in total.
As I was doing that, I learned that Anthony Bolgar had stood up a Discourse instance, initially for the Google+ K40 laser cutter user community, and was inviting other makers to move their communities and discussions to it.
I started to wonder what it would take to import the data from Google+ into Discourse.
Two weeks ago, I read a 10-minute introductory Ruby tutorial, found the Discourse source code, and dove into writing a conversion tool.
When I realized that enabling Google authentication in Discourse would make it possible to identify content authors so that they could automatically maintain control over their own writing, I was immediately more motivated. Not merely preserving content, but also preserving ownership of content, seemed like a noble cause. So I made it work. And I tried to make the conversion look as much as possible as if people had written the content in Discourse in the first place, so that they would be comfortable after the move.
Over the past few days, I have imported about a third of the content we have identified so far. Users can edit and otherwise control the moved content.
I’ve been happy to help. And more than a little annoyed that Google has done so little here to respect G+ users’ trust.
In the weeks that remain, I’m open to additional suggestions for maker-related content that would be appropriate to port to makerforums.
Update 10 March: To date, about 40,000 Google+ posts, with about 270,000 Google+ comments, have been imported into makerforums, preserving users’ ownership of their work.
Update 20 March: The project is essentially finished, with perhaps a few thousand straggling posts to come in the indefinite future, if we can separate the wheat from the chaff in the piles of spam and astroturf posts and comments from a few particularly spammed communities. But I imported almost 50,000 Google+ posts, with over 350,000 comments, into makerforums.
The script has now been used by others for their own imports as well. That’s true success.
Update 28 March: A few more communities came in late (particularly Smoothie and BeagleBoard), so now we have imported over 50K Google+ posts into Discourse topics.