Guarding our data
The mass of personal information on government databases must be protected or public trust will be damaged, ministers are being warned. Information Commissioner Richard Thomas says getting details wrong or mixing them up has huge costs to the people concerned, government and businesses. Details should not be shared just because technology allows it …
Experts estimate that information about the average working adult in the UK is stored on 700 databases. They include information about people's health records, credit checks and household details. "Never before has the threat of intrusion to people's privacy been such a risk," said Mr Thomas. He said many databases were being used to good effect - such as systems for renewing car tax online rather than waiting in Post Office queues. But there can be problems, such as when the Criminal Records Bureau mistakenly labelled thousands of people as criminals. …
There were severe consequences for people if information on (a) database was out-of-date, inaccurate, or given to the wrong people, he said. He pointed to the case of a father investigated by social services after his young daughter said he had "bonked" her - it turned out he had hit her on the head with an inflatable hammer. While social services had closed the file, police and health authority records were not updated and said the man had been suspected of child abuse.
Information Commissioner's Office; Annual Report, 2005–6 (pdf).
Technorati tags: trust, Information Commissioner
July 13, 2006 in Culture & Society, Current Affairs, Digital archives, Digital life, Identity, Politics & Society, Privacy
The BBC, Backstage … and what then?
Ben Metcalfe launched Backstage at Open Tech. His presentation can be downloaded here (PowerPoint) and an audio file can be downloaded via the Open Tech 05 site (the talk was one of those given in the Main Theatre). There's a very useful posting on Backstage about the raft of BBC News RSS feeds, including theme-led feeds.
What's happened to the conservative Auntie we grew up with? Earlier this year, Wired News carried a story entitled, 'The Beeb Shall Inherit the Earth' — by Cory: 'America's entertainment industry is committing slow, spectacular suicide, while one of Europe's biggest broadcasters -- the BBC -- is rushing headlong to the future, embracing innovation rather than fighting it. … With Backstage, BBC's online department takes all the goop in its content-management system -- breaking news, editorials and conferences -- and exposes it as a set of standard programming interfaces. Anyone who can hack a little Perl or Python can mix these into any kind of service they can imagine'. (Cory also sums up the BBC's developing relationship with amateur content providers, 'The BBC's news website is the first mainstream news-gathering organization in the Western world to solicit and give prominence to photographs and reporting provided by its visitors', and the Creative Archive — 'an attempt to digitize all the programming the BBC has commissioned, clear the copyrights and post it online with a Creative Commons-like license. This will allow Britons to download the BBC's content, distribute it and noncommercially remix it into their own films, music, gags, projects and school reports'.)
At Open Tech, I found myself sitting just across the aisle from Stef Magdalinski, author of Wikiproxy — the cause-of-origin of which was explained here (4 October, 2004):
News Online doesn’t engage with its users, it doesn’t provide tools that allow me, the licence payer, to slice and dice their stories, and by refusing to link from its body text, it fails to understand how hypertext works. Also, with its conservative link policy … that only connects the BBC to established brands, it snubs the wider web, the great teeming mass of creativity. Patrician is not authoritative. Aloof is not respected. Conservative and fearful is not engaging. The gap between the BBC’s utterly laudable self image and ambitions and delivery could not be any clearer than at News Online. Finally, by not really allowing user interaction or commenting, News Online forces that debate and activity away from its site, and out onto the wild wild web. I’ve known many people at the organisation since its very earliest days. There’s some incredible talent and ideas, and from what I hear, an equal amount of frustration at how difficult it is to get these ideas to fruition.
Wikiproxy is described by Stef Magdalinski as 'a proxy for the site, that does the following things: retrieves a page from News Online, and regexes out “Capitalised Phrases” and acronyms. It then tests these against a database of wikipedia topic titles. If the phrase is a topic in wikipedia, then it’s turned into a hyperlink; uses the technorati API to add a sidebar of links to blogs referencing the story. Now you can see who’s talking about the story from the story itself …'. And instead of suing him, the BBC went away and came back with Backstage.
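The linking step Stef describes can be sketched in a few lines of Python. This is only an illustration of the idea, not Wikiproxy's actual code: the function name `link_known_topics`, the regular expression and the in-memory set standing in for the "database of wikipedia topic titles" are all assumptions, and the technorati sidebar is left out.

```python
import re
from urllib.parse import quote

# Assumed pattern: two or more consecutive Capitalised Words,
# or an acronym made of two or more capital letters.
PHRASE_RE = re.compile(r"\b(?:[A-Z][a-z]+(?: [A-Z][a-z]+)+|[A-Z]{2,})\b")


def link_known_topics(text, known_titles,
                      base="https://en.wikipedia.org/wiki/"):
    """Wrap every candidate phrase that is a known topic title in a link.

    `known_titles` is a plain set here; a real proxy would query a
    database of topic titles instead.
    """
    def replace(match):
        phrase = match.group(0)
        if phrase not in known_titles:
            return phrase  # not a known topic: leave the text untouched
        url = base + quote(phrase.replace(" ", "_"))
        return '<a href="{}">{}</a>'.format(url, phrase)

    return PHRASE_RE.sub(replace, text)


story = "Today the Prime Minister met the BBC in London."
print(link_known_topics(story, {"Prime Minister", "BBC"}))
```

A pattern this naive has obvious gaps: a sentence that opens with "The Prime Minister" yields the candidate "The Prime Minister", which will not match a title stored as "Prime Minister", and single capitalised words such as "London" are never tested at all.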
Two of three Backstage-unleashed projects that Ben whipped through caught my attention (many more here):
- Google Maps, BBC Travel, Local News, Flickr, Weather & more...: 'a system that can take almost any geographic data and overlay it on a map. It currently displays BBC travel news, BBC London traffic cams, local weather, geotagged Flickr photos and UK Gatso speed camera'.
- BBC Homepage Archive: 'it checks for changes every minute, and also displays the differences in a nice format, so you can scroll back and see how the BBC homepage has changed over the course of a day'. (Compare the BBC Frontpage Archive.)
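The homepage archive's check-and-compare loop is easy to picture in code. What follows is a rough sketch under stated assumptions, not the project's implementation: `diff_snapshots` and `watch` are hypothetical names, and the caller supplies `fetch`, any function that returns the current page as text, so no particular HTTP library or BBC URL is assumed.

```python
import difflib
import time


def diff_snapshots(previous, current):
    """Return unified-diff lines between two page snapshots (empty if equal)."""
    return list(difflib.unified_diff(
        previous.splitlines(), current.splitlines(),
        fromfile="previous", tofile="current", lineterm=""))


def watch(fetch, rounds, interval_seconds=60):
    """Call `fetch` once per interval and keep a diff for every change seen."""
    history = []
    previous = fetch()
    for _ in range(rounds):
        time.sleep(interval_seconds)
        current = fetch()
        changes = diff_snapshots(previous, current)
        if changes:
            history.append(changes)
            previous = current  # the new page becomes the baseline
    return history


# Stand-in for a real fetch: three successive versions of a page.
pages = iter(["Headline A\nStory B", "Headline A\nStory B", "Headline A\nStory C"])
for change in watch(lambda: next(pages), rounds=2, interval_seconds=0):
    print("\n".join(change))
```

Keeping only the diffs, rather than every full snapshot, is one way to let a reader scroll back through a day's changes without storing the page 1,440 times.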
I had to leave before the session in the late afternoon when Lee spoke about Headshift's work for the BBC 'that looked at how social tagging might work on BBC News to drive both social bookmarking and user-driven related stories'. This project (which Lee spoke about at Reboot, last month) strikes me as really interesting; there's more on it here and here.
So there's Backstage, BBC Open Source, the Creative Archive Licence Group ('The BBC, the bfi, Channel 4 and the Open University set up the Creative Archive Licence Group to make their archive content available for download under the terms of the Creative Archive Licence - a single, shared user licence scheme for the downloading of moving images, audio and stills') and also Action Network: 'The BBC runs Action Network as an open forum for people to influence issues they care about. Most of the content is written by the public and reflects their views. Content provided by the BBC is clearly marked'. Other signs of movement and change at the Beeb keep popping up: Mildly Diverting posted last week that the BBC had authorised the opportunity 'to watch 'The Mighty Boosh' on broadband. A WEEK BEFORE IT GOES OUT ON THE TELLY', and Paul Mason (Newsnight) blogged Gleneagles and G8 from outside the BBC.
What does all this amount to? To stick with Backstage for a moment, it is clearly a GOOD THING:
This is such a good idea and will, I hope, cement the BBC's leading role in innovating for public good within the mainstream media. It is the latest in a long line of developments that illustrate how the BBC has become a safe harbour for some clever people who are committed to building public value through online media. It also proves, I think, how the internet has revitalised the BBC's public service remit, which was previously becoming a bit lost amidst the management debates, multi-skilling and the growing obsession with competing with lower forms of commercial media. (Lee)
But before we all get too excited, both Lee and Lloyd Shepherd, Head of Development at Guardian Unlimited, have some cautionary words. The BBC is moving boldly, but hasn't declared itself a liberty hall. Under Backstage's terms of service,
You can’t redistribute BBC content; only the BBC can do that. And Backstage is an ideal way to encourage distribution of BBC content around the world (a fundamental tenet of the BBC’s public service charter) but click on a link and you’re back on a BBC page to look at the full content. The simple fact is that the BBC is not distributing full-text content by RSS; only headlines and snippets (this is even true of Backstage’s own RSS feeds). As the BBC itself has said, it expects 10 per cent of its website traffic to be coming from RSS by the end of this year. In other words, RSS is just another effective way of building audience and traffic, and Backstage is a very good way of getting BBC RSS feeds out into wider communities. (Lloyd Shepherd)
Nevertheless, and in the spirit of that well-worn line from Robert Frost's 'Mending Wall', Something there is that doesn't love a wall, something stirs. Stef Magdalinski said how he was 'inspired by meeting Jimmy Wales of wikipedia.org', a venture and a vision that 'precisely illustrates how the collaborative, great unwashed web can create more value than ‘authoritative’ institutions', and it's great to see the BBC responding creatively, interpreting its remit in new ways for this new age. Definitely time not just to watch Auntie, then, but to join in.
July 26, 2005 in Collaboration, Communication, Creative Commons, Creativity, Culture & Society, Design, Digital archives, Digital life, Digital Rights, Internet, Media, News, Social Software, Television
'This vast, free and open system'
This caught my eye last week — Julian Bond posting about two stories:
Wayback Machine sued: DMCA
IFPI vs Heise vs Allofmp3
The first is about a lawsuit being brought in the USA where an old copy of a company's web site appears in the Wayback Machine. They are claiming copyright abuse using the much-discredited DMCA. Crucially, they claim that old snapshots are available even though more recent snapshots have been prohibited via a robots.txt file that is being honoured. This is a problem that I've hit on Ecademy with Google where somebody has chosen to hide their profile from Google, but Google still maintains an entry in the index and a cached copy of the page from before they made the change.
The second is about a new law in Germany, where promoting a service which is illegal in Germany is also illegal. A German magazine website that specialises in copyright issues has a link in an article to Allofmp3, the Russian paid for music download site. They are being sued in Germany by the International Federation of the Phonographic Industry (IFPI). And this despite the fact that they have not yet brought a case *in Germany* proving that the Allofmp3 site is illegal under German law and within the German jurisdiction.
I looked into Allofmp3 in March of 2004, going so far as to ring The New Statesman to discuss their reporter's judgement that the site is legal. As Tom Armitage reported back then: 'All the music on the site is licensed by ROMS, the Russian Organisation for Multimedia and Digital Systems, and the assistant to the lawyer of ROMS assured music portal Museekster.com that: “the sites you mentioned conduct their business legally and are licensed by ROMS, in full accordance with Russian and international law”. … The site demonstrates a clear understanding of the internet and how best to exploit it, applying local copyright law to a global marketplace. Whether the record industry will be as impressed with it as the public remains to be seen.' Indeed.
I can't say Colin Greenwood was wowed when I told him about Allofmp3, either, but since then (November of last year) I think things have moved on and many musicians are reconsidering how to distribute their music, how the punter should pay for it and what rights the buyer should then have over his/her purchase. One of the reasons why Allofmp3 has attracted praise is because, as Tom Armitage pointed out, 'The files it provides have no digital rights management information attached to them. This means that there are no restrictions on how many times you copy or distribute the files once they’ve been downloaded. The files can be copied between an unlimited number of computers and electronic devices. It is still illegal to give the files to people who have not paid for them, but Allofmp3.com clearly feel they can trust their customers to keep the law, rather than potentially crippling the files they have paid for.'
And as for this nonsense about the Wayback Machine … I've long been a fan and user of it, but since the advent of Firefox and the wonderful extension from Kristof Polleunis that adds a right-click menu which 'allows you to check the page you are browsing or any link in the waybackmachine archive' — well, I use it many times a day and it's an indispensable tool. (There's another great extension there for Google cache, Gcache: 'It will add an entry called "Gcache This Page" to your contextmenu'.)
Julian winds up his post with a great comment and a terrific quotation from Doc Searls:
Like Doc Searls, I'm scared that this vast, free and open system will get tied down, monetized and ruined as more and more commercial and governmental interests try to control it.
This is what we are fighting, folks. The open and free marketplace the Internet provides is shortly going to look like the best darn mess of few-to-many distribution systems for "content" the world has ever known. It will not be the free and open marketplace it was in the first place, and should remain. The end-state will (be) a vast matrix of national and private silos and walled gardens, each a contained or filtered distribution environment. And most of us won't know what we missed, because it never quite happened.
July 21, 2005 in Commerce, Copyright, Culture & Society, Digital archives, Digital life, Digital Rights, History of Ideas, Internet, Music, Politics & Society, Search engines, Web/Tech
More thoughts about DEVONthink
In January, I posted some thoughts and links about GemX TexNotes Pro and DEVONthink. Prentiss Riddle has recently added his thoughts about the latter:
DEVONthink got a lot of attention recently when science writer Steven Johnson wrote an NYT piece about it and similar tools, crediting them with helping him come up with the ideas that go into his work. But in two subsequent blog posts he convinced me that his techniques are not generalizable. He had a research assistant to copy quotes and marginalia from his reading into DEVONthink, and he says directly that its success depended on the quality and granularity of what he saved: “most of the entries are in a sweet spot where length is concerned: between 50 and 500 words. If I had whole eBooks in there, instead of little clips of text, the tool would be useless”. Since I need a tool to manage larger, still undigested documents (i.e., PDFs I haven’t read yet), it wouldn’t work its magic for me. Furthermore, DEVONthink only supports a single hierarchical organizational structure without tags or bibliographic metadata. So I’m still looking for a personal library application.
March 8, 2005 in Content Management, Creativity, Digital archives, Digital life, Good Writing, Knowledge Management
Jamming with your computer
AKAV put me on to GemX TexNotes Pro:
I managed to find a tool supporting a highly stochastic writing process - by keeping track of all my random thoughts. It's highly interlinkable, easy to use and runs smooth so far. … The only thing I miss is integration with a Bib-tex database.
I've just started playing with TexNotes (Windows-only) and so far it looks very good. Like AKAV, I need something that can work with me as I jot down scattered thoughts, quotations and ideas that I know are interlinked and amount to a post, an article or a book.
DEVONthink, a Mac-only program, is also very interesting but seems to go way beyond what TexNotes can do (amongst other things, it's a freeform database). On his blog, Steve Johnson explains a great working relationship he has evolved with this program, and in the NYT he suggests,
… 2005 may be the year when tools for thought become a reality for people who manipulate words for a living, thanks to the release of nearly a dozen new programs all aiming to do for your personal information what Google has done for the Internet. These programs all work in slightly different ways, but they share two remarkable properties: the ability to interpret the meaning of text documents; and the ability to filter through thousands of documents in the time it takes to have a sip of coffee. Put those two elements together and you have a tool that will have as significant an impact on the way writers work as the original word processors did. … These tools are smart enough to get around the classic search engine failing of excessive specificity: searching for ''dog'' and missing all the articles that have only ''canine'' in them. Modern indexing software learns associations between individual words, by tracking the frequency with which words appear near each other.
And this, from his blog, about his 'digital research library': 'When you're freewheeling through ideas that you yourself have collated -- particularly when you'd long ago forgotten about them -- there's something about the experience that seems uncannily like freewheeling through the corridors of your own memory. It feels like thinking.' And a tantalising prospect: 'The other thing that would be fascinating would be to open up these personal libraries to the external world. That would be a lovely combination of old-fashioned book-based wisdom, advanced semantic search technology, and the personality-driven filters that we've come to enjoy in the blogosphere.'
Cory has a fine, general comment on Steve Johnson's use of DEVONthink:
… his computer jams with him, suggesting neat tangents to his subjects. It's a great example of good computer-human interaction, where computers are used to programmatically count and compare quantifiable elements (word and phrase frequencies) and human beings are used to pass judgement on the output of the computers. People are good at understanding and crap at counting; computers are just the reverse.
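The counting half of that partnership, tracking how often words appear near each other, can be sketched very simply. This is a toy illustration of the general idea in Johnson's NYT piece, not how DEVONthink or any of the other programs actually work; `build_associations`, `related` and the window size are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def build_associations(documents, window=3):
    """Count how often each pair of words appears within `window` words."""
    assoc = defaultdict(Counter)
    for doc in documents:
        words = doc.lower().split()
        for i, word in enumerate(words):
            # Look at the words just before and just after position i.
            for neighbour in words[max(0, i - window):i + window + 1]:
                if neighbour != word:
                    assoc[word][neighbour] += 1
    return assoc


def related(assoc, word, n=3):
    """Return up to `n` words seen most often near `word`."""
    return [w for w, _ in assoc[word.lower()].most_common(n)]


notes = ["dog canine", "dog canine", "dog park"]
print(related(build_associations(notes), "dog", n=1))
```

This only captures direct neighbours. Finding "canine" when the two words never share a clip would need something more, such as comparing the neighbourhoods of the two words, and judging whether a suggested tangent is any good remains the human's job.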
January 30, 2005 in Content Management, Creativity, Digital archives, Digital life, Good Writing, Knowledge Management
Wikipedia, Google and open access to knowledge
Is Truth the first victim?
The user who visits Wikipedia to learn about some subject, to confirm some matter of fact, is rather in the position of a visitor to a public restroom. It may be obviously dirty, so that he knows to exercise great care, or it may seem fairly clean, so that he may be lulled into a false sense of security. What he certainly does not know is who has used the facilities before him.
Robert McHenry is Former Editor in Chief, Encyclopædia Britannica
NYT:
To the Editor:
Re ''Google Is Adding Major Libraries to Its Database'' (front page, Dec. 14)
While having online access to some great libraries promises to facilitate research in democratizing access to books, it is worth keeping some things in mind. A digital version of a book -- especially a rare one, printed centuries ago -- is not a replacement for the hard copy. Not only has printed paper proved a durable technology, but there is also much to be gained by visiting the libraries, examining the actual books and entering into discussions with librarians and other researchers. Gaining access to a digital reproduction of an older text makes it easier to take a first step, but little good research will be done simply sitting alone in front of a computer screen.
Lisa Shapiro
Vancouver, British Columbia
Dec. 14, 2004
The writer is an assistant professor of philosophy at Simon Fraser University.
The gatekeepers are enraged, a priesthood agitated once again, and it's an easy spectacle to enjoy. But there are difficult issues, too. Larry Sanger (link via Many2Many), formerly of Wikipedia and its co-founder:
… the following must be taken in the spirit of someone who knows and supports the mission and broad policy outlines of Wikipedia very well. First problem: lack of public perception of credibility, particularly in areas of detail. … regardless of whether Wikipedia actually is more or less reliable than the average encyclopedia, it is not perceived as adequately reliable by many librarians, teachers, and academics. The reason for this is not far to seek: those librarians etc. note that anybody can contribute and that there are no traditional review processes. … there are a great many benefits that accrue from robust credibility to the public. One benefit, but only one, is support and participation by academia. Second problem: the dominance of difficult people, trolls, and their enablers. … A few of the project's participants can be, not to put a nice word on it, pretty nasty. And this is tolerated. So, for any person who can and wants to work politely with well-meaning, rational, reasonably well-informed people--which is to say, to be sure, most people working on Wikipedia--the constant fighting can be so off-putting as to drive them away from the project. The root problem: anti-elitism, or lack of respect for expertise. There is a deeper problem--or I, at least, regard it as a problem--which explains both of the above-elaborated problems. Namely, as a community, Wikipedia lacks the habit or tradition of respect for expertise. As a community, far from being elitist (which would, in this context, mean excluding the unwashed masses), it is anti-elitist (which, in this context, means that expertise is not accorded any special respect, and snubs and disrespect of expertise is tolerated).
Larry Sanger is now 'on the academic job market'. I can't believe he hasn't discovered for himself how trolls and difficult people are quite fully enough represented in academia — a glance almost any week at the Letters pages of the TLS, LRB, etc will make that clear. Wikipedia has no monopoly in that market.
Clay Shirky takes each of Sanger's points and deals with them fairly but firmly. He says:
Of course librarians, teachers, and academics don’t like the Wikipedia. It works without privilege, which is inimical to the way those professions operate. This is not some easily fixed cosmetic flaw, it is the Wikipedia’s driving force. … The physical book, the hushed tones, the monastic dedication, and (unspoken) the barriers to use, these are all essential characteristics of the academy today. It’s not that it doesn’t matter what academics think of the Wikipedia — it would obviously be better to have as many smart people using it as possible. The problem is that the only thing that would make the academics happy would be to shoehorn it into the kind of filter, then publish model that is broken, and would make the Wikipedia broken as well. …
(Wikipedia) is valuable as a site of argumentation and as a near-real-time reference, functions a traditional encyclopedia isn’t even capable of. (Where, for example, is Britannica’s reference to the Indian Ocean tsunami?) The Wikipedia is an experiment in social openness, and it will stand or fall with the ability to manage that experiment. Whining like Sanger’s really only merits one answer: the Wikipedia makes no claim to expertise or authority other than use-value, and if you want to vote against it, don’t use it. Everyone else will make the same choice for themselves, and the aggregate decisions of the population will determine the outcome of the project. And 5 years from now, when the Wikipedia is essential infrastructure, we’ll hardly remember what the fuss was about.
The best thing on this vexed question of authority that I've read in this whole debate is from Collin Brooke (I've cited it here before):
... credibility is something you earn and develop, not something you simply have. When we ask our students to do research and to prepare the results in written form, we are teaching them to earn credibility through breadth and depth of research. You don't earn credibility by citing an "authoritative source," whatever that means. You earn it by testing your sources against one another, understanding what the reasons are for differences of opinion, and figuring out how to resolve them or to choose among positions, etc. In other words, authority should be something that each of us assigns to our sources, not the other way around. It is the result of research, not a prerequisite.
Which goes to support Danah Boyd's view (and she writes as a contributor to and user of Wikipedia):
i do not consider it to be equivalent to an encyclopedia. I believe that it lacks the necessary research and precision. The lack of talent and practice mostly comes from the fact that most entries have limited contributors. Wikipedia is often my first source, but never my last, particularly in contexts where i need to be certain of my facts. Wikipedia is exceptionally valuable to read about multiple sides to a story, particularly in historical contexts, but i don't trust alternative histories any more than i trust privileged ones. … I don't believe that the goal should be 'acceptance' so much as recognition of what Wikipedia is and what it is not. It will *never* be an encyclopedia, but it will contain extensive knowledge that is quite valuable for different purposes.
(See also Slashdot.)
January 4, 2005 in Collaboration, Digital archives, Education, Emergent Intelligence, Reference, Wiki
On remembering
Googlization: the embedding of personal collections in global networks. In commercial visions like Microsoft's MyLifeBits priority is often given to the image of a jukebox of personal memory artifacts. My guess is blogs, on the other hand, would emphasize the inherent connectedness of individual memory to a constantly evolving social context.
Capturing technologies shape the very nature of remembering as they become intertwined in our daily routines of our self-creation.
January 2, 2005 in Culture & Society, Digital archives, Moblogging, Photography, Social Software, Weblogs
Google and the Universities
From John Battelle, these excerpts from the text of a note which was sent to certain parties at Harvard today:
Harvard University is embarking on a collaboration with Google that could harness Google's search technology to provide to both the Harvard community and the larger public a revolutionary new information location tool to find materials available in libraries. In the coming months, Google will collaborate with Harvard's libraries on a pilot project to digitize a substantial number of the 15 million volumes held in the University's extensive library system. Google will provide online access to the full text of those works that are in the public domain. In related agreements, Google will launch similar projects with Oxford, Stanford, the University of Michigan, and the New York Public Library.
The Harvard pilot will provide the information and experience on which the University can base a decision to launch a large-scale digitization program. Any such decision will reflect the fact that Harvard's library holdings are among the University's core assets, that the magnitude of those holdings is unique among university libraries anywhere in the world, and that the stewardship of these holdings is of paramount importance. If the pilot is deemed successful, Harvard will explore a long-term program with Google through which the vast majority of the University's library books would be digitized and included in Google's searchable database. Google will bear the direct costs of digitization in the pilot project.
December 14, 2004 in Collaboration, Culture & Society, Digital archives, Education
Flickr
Flickr continues to impress me. It has to be one of the most innovative and exciting sites on the web. The new, very fast Organizr is a pleasure to use, but so much innovation is coming on-line that it's not always being trumpeted as it arrives. Last night, whilst reading the Flickr forums, I discovered that you can now give your photos the Creative Commons stamp:
'You can choose to use a Creative Commons license to allow more liberal use and sharing of your photos while still maintaining reasonable copyright protection' (here)
'You can also batch-select a license for all previously uploaded photos' (here)
NB: you need to be logged-in with a free Flickr account to view these pages.
September 12, 2004 in Creativity, Digital archives, Digital Rights, Moblogging, Photography, Social Software, Software, Web/Tech
Preserving digital data
This is now a critical issue in our society, as The Independent reported at the weekend (see here). In 'Thirteen Ways of Looking at ... Digital Preservation', Lavoie and Dempsey look at some of the issues involved and conclude:
Preserving our digital heritage is more than just a technical process of perpetuating digital signals over long periods of time. It is also a social and cultural process, in the sense of selecting what materials should be preserved, and in what form; it is an economic process, in the sense of matching limited means with ambitious objectives; it is a legal process, in the sense of defining what rights and privileges are needed to support maintenance of a permanent scholarly and cultural record. It is a question of responsibilities and incentives, and of articulating and organizing new forms of curatorial practice. And perhaps most importantly, it is an ongoing, long-term commitment, often shared, and cooperatively met, by many stakeholders.
As experience in managing the long-term stewardship of digital materials accumulates, there will likely be even more ways we will need to look at digital preservation in the course of building digital information environments that endure over time. But this should come as no surprise: after all, Wallace Stevens found at least thirteen ways of looking at a blackbird.
July 27, 2004 in Culture & Society, Digital archives
Vanishing digital texts
Books, scientific journals and films which are published only on the internet could be lost unless a system to store the digital material in permanent form is put in place, said Lynne Brindley the chief executive of the British Library. ...
The Independent
The BBC's £2.5m Domesday Project, a snapshot of Britain in 1986, is a prime example of the difficulties looming on the horizon if no universal, permanent method of digital storage is found and used. This project stored information and photographs on video discs but within a few years the system used to play them had become obsolete and the discs were rendered virtually useless. Academics have since had to develop software that will emulate the original BBC computer system to ensure its continued accessibility.
Legislation passed last year enshrined the principle that electronic and non-print publications should be deposited at the British Library in the same way as books, pamphlets, maps, printed music, journals and newspapers. However the changes to the Legal Deposit Act did not address the problem of the added costs entailed.
July 26, 2004 in Culture & Society, Digital archives
