Pushing MRD out from under the geek rock

The week before last (30th June – 1st July 2009), I was at the JISC Digital Content Conference having been asked to take part in one of their parallel sessions.

I thought I’d use the session to talk about something I’m increasingly interested in – the shifting of the message about machine readable data (think APIs, RSS, OpenSearch, Microformats, LinkedData, etc.) from the world of geek to the world of non-geek.

My slides are here:

[slideshare id=1714963&doc=dontthinkwebsitesthinkdatafinal-090713100859-phpapp02]

Here’s where I’m at: I think that MRD (That’s Machine Readable Data – I couldn’t seem to find a better term..) is probably about as important as it gets. It underpins an entire approach to content which is flexible, powerful and open. It embodies notions of freely moving data, it encourages innovation and visualisation. It is also not nearly as hard as it appears – or doesn’t have to be.

In the world of the geek (that’s a world I dip into long enough to see the potential before heading back out here into the sun), the proponents of MRD are many and passionate. Find me a Web2.0 application without an API (or one “on the development road-map”) and I’ll find you a pretty unusual company.

These people don’t need preaching at. They’re there, lined up, building apps for Twitter (to the tune of 10x the traffic which visits twitter.com), developing a huge array of services and visualisations, graphs, maps, inputs and outputs.

The problem isn’t the geeks. The problem is that MRD needs to move beyond the realm of the geek and into the realm of the content owner, the budget holder, the strategist, for these technologies to become truly embedded. We need to have copyright holders and funders lined up at the start of the project, prepared for the fact that our content will be delivered through multiple access routes, across unspecified timespans and to unknown devices. We need our specifications to be focused on re-purposing, not on single-point delivery. We need solution providers delivering software with web APIs built in. We need to be prepared for a world in which no-one visits our websites any more, instead picking, choosing and mixing our content from externally syndicated channels.

In short, we now need the relevant people evangelising about the MRD approach.

Geeks have done this well so far, but now they need help. Try searching on “ROI for APIs” (or any combination thereof) and you’ll find almost nothing – very little evidence outlining how much APIs cost to implement, what cost savings you are likely to see from them, or how they reduce content development time, and few guidelines on how to deal with syndicated content copyright issues.

Partly, this knowledge gap is because many of the technologies we’re talking about are still quite young. But a lot of the problem is about the communication of technology, the divided worlds that Nick Poole (Collections Trust) speaks about. This was the core of my presentation: ten reasons why MRD is important, from the perspective of a non-geek (links go to relevant slides and examples in the slide deck):

  1. Content is still king
  2. Re-use is not just good, it’s essential
  3. “Wouldn’t it be great if…”: Life is easier when everyone can get at your data
  4. Content development is cheaper
  5. Things get more visual
  6. Take content to users, not users to content (“If you build it, they probably won’t come”)
  7. It doesn’t have to be hard
  8. You can’t hide your content
  9. We really is bigger and better than me
  10. Traffic

All this is is a starter for ten. Bigger, better and more informed people than me probably have another hundred reasons why MRD is a good idea. I think this knowledge may be there – we just need to surface and collect it so that more (of the right) people can benefit from these approaches.

The Brooklyn Museum API – Q&A with Shelley Bernstein and Paul Beaudoin

The concept and importance of museum-based APIs are notions that I’ve written about consistently (boringly, probably) both on this blog and elsewhere on the web. Programmatic and open access to data is – IMO – absolutely key to ensuring the long-term success of online collections.

Many conversations have been going on about how to make APIs happen over the last couple of years, and I think we’re finally seeing these conversations move away from niche groups of enthusiastic developers (e.g. Mashed Museum) into a more mainstream debate which also involves budget holders and strategists. These conversations have been aided by metrics from social media sites like Twitter which indicate that API access figures sometimes outstrip “normal web” browsing by a factor of 10 or more.

On March 4th 2009, Brooklyn Museum announced the launch of their API, the latest in a series of developments around their online collection. Brooklyn occupies a space which generates a fair amount of awe in museum web circles: Shelley Bernstein and team are always several steps in front of the curve – innovating rapidly, encouraging a “just do it” attitude, and most importantly, engaging wholly with a totally committed tribe of users. Many other museums try to do social media. Brooklyn lives social media.

So, as they say – without further ado – here’s Shelley and Paul talking about what they did, how they did it, and why.

Q: First and foremost, could you please introduce yourselves – what your main roles and responsibilities are and how you fit within the museum.

Shelley Bernstein, Chief of Technology. I manage the department that runs the Museum’s helpdesk, Network Administration, Website, gallery technology, and social media.

Paul Beaudoin, Programmer. I push data around on the back-end and build website features and internal tools.

Q: Can you explain in as non-technical language as possible what exactly the Brooklyn API is, and what it lets people do?

SB: It’s basically a way outside programmers can query our Collections data and create their own applications using it.

Q: Why did you decide to build an API? What are the main things you hope to achieve …and what about those age old “social web” problems like authority, value and so-on?

SB: First, practical… in the past we’d been asked to be a part of larger projects where institutions were trying to aggregate data across many collections (like d*hub). At the time, we couldn’t justify allocating the time to provide data sets which would become stale as fast as we could turn over the data. By developing the API, we can create this one thing that will work for many people so it no longer becomes a project every time we are asked to take part.

Second, community… the developer community is not one we’d worked with before. We’d recently had exposure to the indicommons community at the Flickr Commons and had seen developers like David Wilkinson do some great things with our data there. It’s been a very positive experience and one we wanted to carry forward into our Collection, not just the materials we are posting to The Commons.

Third, community+practical… I think we needed to recognize that ideas about our data can come from anywhere, and encourage outside partnerships. We should recognize that programmers from outside the organization will have skills and ideas that we don’t have internally and encourage everyone to use them with our data if they want to. When they do, we want to make sure we get them the credit they deserve by pointing our visitors to their sites so they get some exposure for their efforts.

Q: How have you built it? (Both from a technical and a project perspective: what platform, backend systems, relationship to collections management / website; also how long has it taken, and how have you run the project?)

PB: The API sits on top of our existing “OpenCollection” code (no relation to namesake at http://www.collectiveaccess.org) which we developed about a year ago. OpenCollection is a set of PHP classes sitting on top of a MySQL database, which contains all of the object data that’s been approved for Web.

All that data originates in our internal collections management systems and digital asset systems. SSIS scripts run nightly to identify approved data and images and push them to our FreeBSD servers for processing. We have several internal workflow tools that also contribute assets like labels, press releases, videos, podcasts, and custom-cropped thumbnails. A series of BASH and PHP scripts merge the data from the various sources and generate new derivatives as required (ImageMagick). Once compiled new collection database dumps and images are pushed out to the Web servers overnight. Everything is scheduled to run automatically so new data and images approved on Monday will be available in the wee hours Tuesday.

The API itself took about four weeks to build and document (documentation may have consumed the better part of that). But that seems like a misleading figure because so much of the API piggy-backs on our existing codebase. OpenCollection itself – and all of the data flow scripts that support it – took many months to build.

Cool diagrams. Every desk should have some.

Q: How did you go about communicating the benefits of an API to internal stakeholders?

SB: Ha, well we used your hoard.it website as an example of what can happen if we don’t! The general discussion centered around how we can work with the community and develop a way people can do this under our own terms, the alternative being that people are likely to do what they want anyway. We’d rather work with, than against. It also helped us immensely that an API had been released by DigitalNZ, so we had an example out there that we could follow.

Q: It’s obviously early days, but how much interest and take-up have you had? How much are you anticipating?

SB: We are not expecting a ton, but we’ve already seen a lot of creativity flowing which you can check out in our Application Gallery. We already know of a few things brewing that are really exciting. And Luke over at the Powerhouse is working on getting our data into d*hub already, so stay tuned.

Q: Can you give us some indication of the budget – at least ballpark, or as a % compared to your annual operating budget for the website?

SB: There was no budget specifically assigned to this project. We had an opening of time where we thought we could slot in the development and took it. Moving forward, we will make changes to the API and add features as time can be allocated, but it will often need to be secondary to other projects we need to accomplish.

Q: How are you dealing with rights issues?

SB: Anything that is under copyright is being delivered at a very small thumbnail size (100px wide on the longest side) for identification purposes only.

Q: What restrictions do you place on users when accessing, displaying and otherwise using your data?

SB: I’m not even going to attempt to summarize this one. Here’s the Terms of Service – everyone go get a good cup of coffee before settling down with it.

Q: You chose a particular approach (REST) to expose your collections. Could you talk a bit about the technical options you considered before coming to this solution, and why you preferred REST to these others?

PB: Actually it’s been pointed out that our API isn’t perfectly RESTful, so let me say first that, humbly, we consider our API REST-inspired at best. I’ve long been a fan of REST and tend to gravitate to it in principle. But when it comes down to it, development time and ease of use are the top concerns.

At the time the API was spec’ed we decided it was more important to build something that someone could jump right into than something meeting some aesthetic ideal. Of course those aren’t mutually exclusive goals if you have all the dev time in the world, but we don’t. So we thought about our users and looked to the APIs that seemed to be getting the most play (Flickr, DigiNZ, and many Google projects come to mind) and borrowed aspects we thought worked (api keys, mindful use of HTTP verbs, simple query parameters) and left out the things we thought were extraneous or personally inappropriate (complicated session management, multiple script gateways). The result is, I think, a lightweight API with very few rules and pretty accommodating responses. You don’t have to know what an XSD is to jump in.
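For readers who haven’t used an API of this shape, here is a minimal sketch of what “api keys plus simple query parameters” might look like from the outside programmer’s side. Everything specific in it is an assumption made for illustration: the endpoint URL, the parameter names (api_key, q, limit) and the key itself are made up, and are not the Brooklyn Museum API’s actual ones. It only builds the request URL and doesn’t touch the network.

```python
from urllib.parse import urlencode

# Hypothetical endpoint for illustration only: NOT the real Brooklyn Museum API URL.
BASE_URL = "https://api.example-museum.org/v1/objects"


def build_search_url(api_key: str, keyword: str, limit: int = 10) -> str:
    """Build a GET URL carrying an API key and simple query parameters.

    The parameter names (api_key, q, limit) are illustrative assumptions.
    """
    params = {"api_key": api_key, "q": keyword, "limit": limit}
    # urlencode escapes the values, so spaces in the keyword are safe.
    return f"{BASE_URL}?{urlencode(params)}"


# Example: a search for five objects matching a keyword.
print(build_search_url("DEMO_KEY", "egyptian cat", limit=5))
```

The whole request fits in one URL that could be pasted into a browser, which is roughly the “jump right in” quality Paul describes: no session handling, no schema to learn first.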

Q: What advice would you give to other museums / institutions wanting to follow the API path?

SB: You mean other than “do it” <insert grin here>? No, really, if it’s right for the institution and their goals, they should consider it. Look to the DigitalNZ project and read this interview with their team (we did and it inspired us). Try and not stress over making it perfect first time out, just try and see what it yields…then adjust as you go along. Obviously, the more institutions that can open their data in this way, the richer the applications can become.

_______

Many, many thanks to Shelley and Paul for putting in the time to answer my questions. You can follow the development of the Brooklyn Museum collections and API over on their blog, or by following @brooklynmuseum on Twitter. More importantly, go build something cool 🙂

The person is the point

This is just going to be a quickie, mainly so I get it out before I go away on holiday never to remember it again. At some point I might expand on it.

Over the last few weeks in particular, we’ve seen the public finally sitting up and noticing Twitter. It’s been on the BBC, all over the news and makes for interesting watching on Google Trends, too:

Twitter / UK / 12 months

About a year ago, my assessment of so-called “lifestreaming” was that it was all a timesink. Back then, I hadn’t pulled as deeply on the Twitter crack pipe as I have since, or do now. Looking back (nearly 5,000 tweets and 300 followers in), my thoughts are on the one hand changed – radically – and on the other, mostly the same.

My views have changed in terms of signal / noise ratio because Twitter has deeply, deeply affected me, the way I work and the way I consume and receive content and news. I can’t think of a technology that comes even close. The panic – and it is panic – that I feel when I consider a world without Twitter is, actually, pretty worrying.

On the other hand, my views about institutional Twitter have changed only a little. Back then, I questioned that Twitter has a place at all in an institutional setting. Now, with some water under the bridge, I’ve tuned my assessment of this. My current take on this is that there are only a few ways in which institutions can create convincing, fun, and followable Twitter streams.

The first of these is when it is automated (for example, Towerbridge – and this particular example is a genius use of various bits of technology). The second is at the opposite end of the spectrum, and that is when institutions are given personality, usually because the person doing the tweeting can sit outside the corporate MarketingFluff (TM). The obvious example is the always-great Brooklyn Museum. The third is when it is just plain useful, giving rapid updates on a topic in a way that other channels can’t.

As the interest grows, we’re starting to see the cultural sector increasingly wanting a slice of the pie, and the first thing they’re asking is how do we engage with this new channel? How do we mix it into our offering and make it work for us?

Right now, many of the museums on Twitter are using it in an informal, below-the-radar context. The problem is that as the thing goes more mainstream, we’re likely to see the same old problem we’ve seen with institutional blogging: it just ends up becoming the same old shit from marketing leaflets, regurgitated into new channels.

Twitter, like blogging, needs an edge, a voice, a riskiness. As long as institutions can retain this – i.e., do it for a reason – then, IMO, things will get more interesting. If they don’t, we’ll probably all be unfollowing museums as quickly as we can slide down the steep, slippery trough of disillusionment.

Where the F have you been?

It’s been a long while (possibly the biggest gap since the launch of this blog..) since my last post – over a month.

This is unprecedented for me, and I’ve had four or five emails (thanks!) asking me why. I’ve always dodged around with an answer, not because I was trying to avoid some horrific truth but because until the last couple of days I simply haven’t had the brain time to devote to the reasons.

The first part of the answer to “Mike, where the F have you been?” is this: I’ve been busy keeping balls in the air: another presentation (What does Web 2.0 DO for us?) which I delivered to a roomful at Online Information 2008 on 4th Dec…the beginning stages of writing a module for the new Digital Heritage MA/MSc at Leicester University – an opportunity which I’m hugely excited about, and not a little bit scared too…continuing work on three side-projects, none of which I can talk about just yet…development and writing for a corporate blog for internal comms…a desktop notification app…not to mention the hectic craziness of helping look after a 2-boy young family. Etcetefuckinra.

All of which is terribly boring, TBH, because if there’s one thing we all know about each other it is this: we’re all much too busy. In fact a corporate stat somewhere a while ago said that everyone believes themselves to be busier than 90% of everyone else. This is, of course, also true for me.

This leads to the second part of the answer: I’ve felt for a long time that the landscape of blogging has been changing considerably, particularly with lifestreaming now a part of our daily diet. I’ve blogged about noise on various occasions, and I’ve also noticed a huge shift in my own reading habits – a shift which has an obvious effect on my writing habits, too. I’m less interested in “blog post as news”, instead preferring longer, deeper, better written pieces like the beautifully-crafted Business Requirements Are Bullshit. I’m me – you’re you – but the important thing for me is that I write in a way which complements the medium and as much as possible brings some kind of value to those of you who have given up some of your valuable time to read what I have to say.

This brings me neatly on to the third part, which was summed up in a conversation with Brian Kelly and Paul Walk over a post-work pint recently: why the F do we all blog, anyway? We were talking at the time about Paul’s much-commented post on blog awards. Paul is similar to me – and different to Brian – in that he blogs as a hobby and not as a job. Paul runs his blog under his own name; Brian runs his (albeit not “officially”) under “UKWebFocus”. Brian has a series of blog policies and sticks closely to his particular topics; Paul could write about his washing powder if he so chose. I’ve always been clear (both to my readers and employers) that this isn’t a “work blog” – but it isn’t a “personal” one, either.

I started Electronic Museum as a way of reflecting on technology in the museum space. More than a year on and I’m interested in innovation, in technology ubiquity, in sharing data, in real people, in the value of attention data, in the user as focus. All of these call back to what makes museums unique, in my opinion, and it is in these arenas that I personally feel the battles for online content will be (or are being) fought and won. The point is it isn’t just a conversation about museums any more. And really, it never has been, in this always-on, radically-connected crazy internetwebthing we spend so much time staring at and talking about.

Much as I’ve carved a niche here with museum professionals who seem to value what I have to say, I’m also fascinated by the irony that nowadays it isn’t niche professionals that we need any more. Curators (museum and otherwise) – IMO – aren’t anything at all without the vision to see that what they know needs communicating in new, challenging ways; ways that may well undermine their professionalism purely because the social network they engage with has dug up someone who knows better than them. Content owners need to start to understand that value simply can’t be measured by “visits” when many people are out there having experiences with their content and not within the walled garden of their site. Technologists have got to stop hiding behind PEBCAC and start engaging with the people that are currently alienated by technology.

So what – exactly – am I saying?

I guess it is this: you’ll notice a shift over the coming weeks and months as I write about more of the things I’m doing outside of the museum space: my dabblings with the Arduino, for instance, the various other projects I’m continuously working on, a secretish partnership I’ll be able to talk about in January, and so on. I hope I won’t break the niche I’ve created – I hope that if you are a “museum professional” then you’ll continue to hang out here – I think what I have to say will be interesting, or at least mildly entertaining, whoever you are.

“we have a tech generation that thinks that’s all there is”

How to go about writing up a conference like Future of Web Apps? With, what, a thousand plus people converging on a space as large as London’s Excel centre, it’s not like you can be at every talk, breathe in every vibe, taste all the startups. I was even more crippled by the fact that I couldn’t make the first day. Nonetheless, here are some thoughts…

Mark Zuckerberg. Now with media training (TM)

Conferences – in my experience anyway – aren’t usually about the sessions. They’re about the people, the schmooze, the drinking, the between bits. FOWA does these bits – big time. I had the headache to prove it. From that perspective, FOWA (and I believe I’ve – almost by accident – been to every one) is a winner. Big names (Zuckerberg, Rose, Arrington, Sierra..), big announcements, big…well, everything.

For this, Carsonified (and I’m slowly getting to know ’em – they’re Bath-based after all..) get massive quantities of respect. Ryan Carson is good at this shit: he knows it, the industry knows it, and it’s obviously a formula that works.

But..but..but..

I also think that conferences need a very strong sense of direction. It’s all too easy to revel in the hero-worship that surrounds people like Zuckerberg, and somehow forget that however much we might want to influence 100 million people with our web app, most of us aren’t there yet, and there’s a huge number of boxes to tick – technology, funding, usability, content, luck – before we’re going to even stand a chance of getting there. FOWA should be the place that, even if not actually answering these questions, goes about helping young developers begin to ask them: how can I get funding, what technology should I use, how can I create outstanding content, and so on. I’m not close to being a cutting edge developer, but every session on the developer track was so generic you could probably sum them up like this: “oAuth: it’s quite good”, “cloud computing: it’s quite good”, “work-life balance: it’s quite good”. To me, FOWA doesn’t come across as the future of web apps. It’s the near past of web apps. 

The challenge that Ryan et al. face is not an easy one: they’ve built a conference of big names, and with that comes a conference with a high level of sexiness and kudos. But what they haven’t done, IMO, is to build a conference with big ideas. This is increasingly going to be a problem as – in the words of developers – FOWA attempts to scale into the future. As much as the bits-in-between make you feel warm inside about the whole tech scene, it’s a transient kind of warmth – as Simon Cowell said recently on XFactor (I know, hard to imagine someone as high-brow as me watching..): “it’s like eating water”. Without really challenging sessions, the socialising bit becomes really pretty vacuous.

I don’t have the answers to this, but I have some thoughts:

Firstly, and most importantly – ideas. If we’re not at FOWA to exchange ideas, what exactly are we there for? At events like this – actually, at events like life – I’m looking for disruption, for new stuff, for insight, for difference. I’m not expecting academically rigorous research: I go to museum conferences for that – but newness should surely be a part of a conference all about the future, right? While some of the sessions delivered that (for me: Kathy Sierra on engaging users and Gavin Starks on green computing), for the most part this was very much a safe, formulaic place and not a bleeding-edge, forward-looking one. The business talks were leagues ahead of the developer ones, but even so there wasn’t enough challenging going on. Even Jason Calacanis, who pretty much makes a living from being offensive, didn’t manage to say much about life/work balance apart from “work hard, play hard”, which is hardly disruptive or original. Originality is often brave and sometimes dangerous, but I think this is the space that FOWA should be striving to be in.

Second: speakers need to be not just mediocre or good, but fucking great. I want entertaining, well-delivered, funny. Simon Wardley (I missed his session, but we shared a stage in Cardiff a couple of weeks ago) – is all of these. He rocks. He could talk shit and it’d still be great – as it happens, he talks with sense and conviction AND makes it funny too. Ditto, Kathy Sierra, who in my opinion did the best thing I’ve seen in some time: a funny, insightful, interactive session which really engaged as well as inspired. Many of the people presenting at FOWA just can’t do it. They might be great developers, but they can’t talk in public, and I’m sorry, but if you can’t do it, don’t do it. Or at least have a mind-blowing idea to cover up the fact you can’t talk about it 🙂

wakoopa. software without a reason, and bad spelling too...

Finally: I think that all events like this can – and should – learn from people outside the specific sector. The tech scene should increasingly be listening to, and encouraging discourse with normal people. Ask yourself – where were the users at FOWA? It’s easy impressing a room full of developers with your new startup. It’s incredibly hard impressing a room full of people who have full, busy lives doing things other than geekery. It’s great having the funders and business guys there, but I also think it’d be really interesting to hear from people who struggle with technology – and endeavour to get some insight into what works for them. I’m personally 100% in support of Tim O’Reilly and his crusade to encourage tech that makes a difference rather than tech that scratches a transient, unimportant itch (and yes, Wakoopa, I’m afraid that’s you..).  I think it’s especially important to focus on this stuff in the current wave of uncertainty about our financial and environmental futures.

I hope this doesn’t seem an overly negative response to FOWA. It’s not meant to be – after all, I’ll be going again next year. This is a great event, and really the only one of its kind in the UK – but I also hope they learn to grow over time and mature the conference into something with a bit more weight – not serious, or academic, but perhaps finding ways to improve quality, Pirsig style.

A barcamp in Bath? Bathcamp, obviously.

here? maybe

Nothing quite like leaping in and doing something before sorting out any of the details, but I’m hoping to organise a barcamp type moment in the Bath vicinity sometime during summer 2008.

I’m actually possibly the worst person to do anything with such enormous logistical overhead, but as long as I remain confident, calm and don’t tell anyone that I don’t know what I’m doing, then everything will turn out ok. I also have a very fine set of people who are up for helping, including my wife, who loves this sort of thing.

Anyway.

What: some kind of barcamp event
Where: Bath or very close nearby
When: a Saturday (+night) sometime late summer 2008

If I’ve managed to pique your interest enough with this genius bit of marketing that tells you neither what, when nor where exactly then please head over to http://bathcamp.org and register your details. I’ve also added an entry to the barcamp wiki at http://barcamp.pbwiki.com/BarCampBath.

Once I’ve got some measure of the numbers I’ll start refining dates, venues and content – sorry! structure 🙂

Look forward to hearing from you!