Anthology of interest 2

Interesting things I’ve been reading/watching/listening to recently:

📖 What we’ve learned from #ExpertDebate – Wellcome’s Director Jeremy Farrar reflects on the issues that mistrust of expertise raises for researchers.  He warns us to beware double meanings when talking science: these 5 words have different meanings in the context of science vs everyday usage: theory, significant, risk, determine, predict.

📽 The closing keynote of UKSG Conference 2017: Post-Truth: the role of publishers and librarians by Charlotte Roueché, Professor Emeritus of Digital Hellenic Studies at King’s College London (and here is her ORCID record).  In a post-truth era where fake news proliferates, she asks us “What are you doing to preserve truth and honest thinking?”  We are all researchers, and it’s not good enough to say it’s not our problem.  She emphasises the importance of linked open data, and highlights 5-star Open Data – how to make your data open, re-usable, and linked (includes examples, and costs/benefits for each level).

👂 Copyright or Wrong – Leading copyright lawyer and author Richard Taylor asks whether copyright is an analogue law in the digital age.  Featuring German MEP Julia Reda, and Monty Python.

📖 Worth Dying For: The Power and Politics of Flags by Tim Marshall.  I learned a lot from Tim’s previous book, Prisoners of Geography: Ten Maps That Tell You Everything You Need to Know About Global Politics, so I had high expectations!  I devoured Worth Dying For and will soon re-read it to make sure I can remember as many as possible of Tim’s observations and anecdotes.  He has a dry sense of humour too!

📽  … While awaiting the arrival of the copy you’ve ordered, watch this video of Tim speaking about The Power and Politics of Flags (public lecture at LSE).

👂 I’m really enjoying the current BBC World Service series 50 Things That Made The Modern Economy.  Wondering how the development of air-conditioning might be linked to Ronald Reagan winning the 1980 US Presidential Election?  Listen to the episode on air-conditioning and then catch up with all the others.   Each episode is only 9 minutes long, and all of them will give you something to think about.

📖 How We Got “Please” and “Thank You” – Why the line between politeness and bossiness is a linguistic mirage (and the idea of the tacit calculus of debt).

On which matter, thanks for reading 🙂


Copyright fight: Authors Guild v. HathiTrust

Disclaimer: I am not a lawyer! This summary is written in good faith and any errors are my own (let me know and I’ll correct ’em).  Carry on…

HathiTrust is a collaborative project between a number of university libraries and other institutions to establish a repository to digitise (archive) and share access to their collections.

The HathiTrust collection includes both public domain and in-copyright content from a variety of sources, including Google, the Internet Archive, Microsoft, and partner institution projects.

Public domain content from HathiTrust is publicly accessible, and in-copyright content is accessible to authenticated users.

The main aims of digitisation projects like HathiTrust include ensuring long-term preservation of the materials (waiting until the works pass into the public domain often means the opportunity for scanning them in good condition has passed); making the content of books and journals more discoverable; opening up library content to students and others with print disabilities; and ensuring the continued relevance of the book culture in an increasingly digital age (list taken from the Committee on Institutional Co-operation, a HathiTrust partner).

The Authors Guild Lawsuit

(surely that should be Authors’ Guild? #pedant)

In September 2011, the Authors Guild, the Australian Society of Authors, the Union Des Écrivaines et des Écrivains Québécois (UNEQ), and eight individual authors filed a lawsuit against HathiTrust and a number of American universities, citing gross copyright violation.

In October 2012, a federal court ruled against the Authors Guild, finding that HathiTrust’s use of books scanned by Google was fair use under US law.

The HathiTrust repository contains over 10.5 million scanned books, most of which were created as part of the Google Books project.  Of these, about 31% are in the public domain, meaning that the remaining 69% are still in copyright.

Some of the main issues in the case were:

  1. The storage of scanned book images and text files for preservation purposes
  2. Indexing the full-text of the files for search purposes (though the search results show only where search terms appear in the catalogued items and do not allow the full text of the item to be read)
  3. Format-shifting to make works accessible to users with disabilities (e.g. creating a digital copy of a work that can be read by a person with visual impairment using screen reading software, even if digitising the work for other reasons is not permitted)

Main outcomes of the case for information professionals

Sections 107 and 108 of the US Copyright Act

Section 108 of the US Copyright Act allows libraries to make copies (within limits) for preservation and research.  It includes an explicit statement preserving the application of fair use:

Nothing in this section . . . in any way affects the right of fair use as provided by section 107.

The copyright owners argued that because one specific statute (108) applies to libraries, the general statute on fair use (107) cannot apply.  The court ruled that libraries may rely on both Section 108 and Section 107: section 108 on library privileges doesn’t limit the scope of fair use (section 107).

Search indexing

Although the defendants argued that creating copies for preservation is “transformative,” the court did not agree.

Maintaining text files for searching is a transformative use, because the copies serve an entirely different purpose from the original works.  As the files were only for search and not for full-text access, no copyrighted content was accessible.

Search indexing is a transformative use, and it is a fair use.


American educational institutions are mandated to serve the needs of people with disabilities under the Americans With Disabilities Act.

Section 121 of the US Copyright Act permits an “authorized entity” to make formats of certain works available to persons who are visually impaired.  An “authorized entity” is one that has a “primary mission” to serve those needs.  The court decided that although libraries and universities have many functions, they do have a “primary mission” to serve those needs.

There is no conflict of interest with commercial use, as there is no market for scanning and making materials available to people who are print-disabled, nor is one likely to develop.

Access for people who are print-disabled is a transformative use, and it is a fair use.

Commercial use

The court decided that the HathiTrust partner libraries weren’t making materials available for commercial use, even though they partnered with Google to carry out the scanning.

This is important for UK copyright holders whose works in US libraries have been digitised via Google Books or similar projects.

See this summary from Columbia University on “Effect on the Market for the Works”:

  • For noncommercial uses, the plaintiff must show “by a preponderance of the evidence that some meaningful likelihood of future harm exists.”
  • The court rejected the argument of lost sales, finding that sales of books would have not served text searches or access for persons who are print disabled.
  • The court found that the copies in HathiTrust were not a security risk, noting the evidence presented about the security measures in place.
  • The court also found assertions of future licensing revenue to be “conjecture” without evidence of some actual harm.
  • In broad terms, the court also ruled that copyright owners “cannot preempt a transformative market” and uses that are in a “transformative market” do not cause a loss of license revenue.
  • The projected high cost of any possible license market would also be cost prohibitive for an initiative such as HathiTrust, and it may not be possible at all given the numerous works and the need to locate copyright owners.
  • Regarding the needs of the print-disabled, the evidence showed that they are a “tiny minority” and a market to allow them access to millions of books “is consequently almost impossible to fathom.”

Digitisation projects such as those carried out by HathiTrust and its partner universities are non-commercial.

Other reports and opinions on the case

United States District Court, Southern District of New York: The Authors Guild, Inc., et al., against Hathitrust, et al. 11 CV 6351 (HB) Opinion & Order

The Chronicle of Higher Education – Judge Hands HathiTrust Digital Repository a Win in Fair-Use Case

Columbia University Libraries/Information Services Copyright Advisory Service –  Court Rules on HathiTrust and Fair Use

Copyright Librarian – Author’s Guild v Hathi Trust: A Win for Copyright’s Public Interest Purpose

HathiTrust – Information about the Authors Guild Lawsuit

The Michigan Daily – ‘U’ wins copyright lawsuit against Hathitrust digitalization project

Wired – Judge Says Fair Use Protects Universities in Book-Scanning Project

Essential law for information professionals

I’ve recently enjoyed reading Paul Pedley’s book, Essential law for information professionals (3rd edition).  Best read a chapter at a time, it gives a practical introduction to the many areas of law you may encounter in your work in an information context.  I particularly liked how he used examples of real cases to illustrate how library staff have become embroiled in legal action and what the outcomes and learning points were.

Here are some of my gleanings:

  • Do you know the difference between R and TM? ® is a registered trade mark and ™ is an unregistered trade mark
  • You can search the Data Protection public register here
  • An escrow agreement is recommended if using cloud computing services – it is an agreement requiring the service provider to deposit their source code and related materials with a neutral third party.  If release conditions are triggered (e.g. service provider goes into administration) the customer can access the application, their own proprietary data and intellectual property which supports the software as a service solution [SaaS].
  • And finally, a separate post about e-resource licences

When preparing notes for this post, I was worried that I might be infringing copyright law (how ironic) but decided in the end that since I have given full attribution and have only referred to a few short sections of the book (less than a chapter or 5%), it would probably be ok.

E-Resources FAQ

This is a collection of things I wish everyone knew about e-resources.  Whether this area is new to you or not, I hope you find something useful here; and do let me know about any points I’ve missed in the comments.

What are e-resources?

E-resources are also known as electronic resources and there are two main types: e-journals (or electronic journals) and databases.

Many e-journals are digital copies of print journal articles, but increasingly e-journal articles are published without a print analogue.

There are several kinds of databases

  • Bibliographic – this type of database is a collection of references to published literature.  It functions in a similar way to a library catalogue, but indexes details of articles rather than books
  • A&I (abstracting and indexing) – in addition to bibliographic details, this type of database also contains abstracts of the individual articles
  • Full text – a database which includes the full text of all the articles it has indexed
  • Data/statistics – a collection of numbers and facts which you can query in order to extract a particular dataset.  A database in the purest sense of the word.
  • Images – a database containing a searchable index of images and the images themselves

What does full text mean?  Full text refers to an e-resource that makes available online the whole contents of journal articles, not just the abstract or citation.  Full text articles are often subscription resources, requiring an individual or institutional account for access.

What is an abstract?  An abstract is a summary of a journal article, often published at the beginning of the article.

What is a platform? A platform is a website which hosts content or programs.  Examples include JSTOR and ISI Web of Knowledge (which hosts a number of databases including, confusingly, Web of Science).

What is SFX?  SFX is an OpenURL link resolver, which works by compiling a list of all the journals to which an institution (such as a university) is subscribed and linking to that content.  Primarily, it functions to allow you to search an institution’s subscriptions to see if you can access a particular e-journal, and which years are included in the subscription.  At Oxford University, SFX is locally branded as OU eJournals and is one of a number of resources whose contents are searchable via SOLO.
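To make the idea of a link resolver a little more concrete, here is a minimal sketch of how an OpenURL might be put together for a journal article.  The resolver address and the ISSN are made-up placeholders, and although the parameter names follow the key/value style of the OpenURL standard (Z39.88-2004), which keys a particular resolver such as SFX actually accepts is an assumption here.

```python
from urllib.parse import urlencode

# Hypothetical resolver address: every institution has its own base URL.
RESOLVER_BASE = "https://resolver.example.edu/openurl"


def build_openurl(issn: str, year: int, volume: int, start_page: int) -> str:
    """Build an OpenURL-style link describing one journal article."""
    params = {
        "url_ver": "Z39.88-2004",                       # version of the OpenURL standard
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",  # the item described is a journal item
        "rft.genre": "article",
        "rft.issn": issn,
        "rft.date": str(year),
        "rft.volume": str(volume),
        "rft.spage": str(start_page),
    }
    return f"{RESOLVER_BASE}?{urlencode(params)}"


# Placeholder ISSN, not a real journal.
print(build_openurl("0000-0000", 2010, 12, 345))
```

The resolver compares the details in the link with the institution’s list of subscriptions and, if the year is covered, sends you on to the article.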

What is MetaLib?  MetaLib is a search system which allows you to search for resources, link to them, and (in some cases) search within them.  This is not possible for all resources, as they need to be compliant with a protocol called Z39.50 in order to be searchable.  At Oxford University, MetaLib is locally branded as OxLIP+ and is one of a number of resources whose contents are searchable via SOLO.

What is a paywall?  A paywall is a barrier to a website which requires you to authenticate to view the content.  Usually, this requires a paid subscription.  An important implication of this is that any content behind a paywall is not indexable by search engines and therefore will not appear in the search results.  Not everything on the Internet is known to Google.

There are several methods of authentication

Internet Protocol (IP) – the IP address of your computer indicates roughly where you are in the world; sites like BBC iPlayer, for example, use it to check which country you are in.  If you are using the university’s computing facilities on campus, your computer will have an IP address within the university’s main range.  The e-resource you are trying to reach detects this and grants access.  Working “off-campus” means that you are off the university network, perhaps using your own laptop in a university library or working from home.  Your computer’s IP address is then outside the institution’s IP range and you will need a different method of access.  VPN software is commonly used to solve this issue: it works by extending the institution’s network to your computer, thereby bringing it into the institution’s IP range.
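For the curious, the check that an e-resource carries out can be sketched in a few lines of Python.  This is only an illustration: the ranges below are reserved documentation addresses standing in for a real campus range, and the function name is my own invention.

```python
import ipaddress

# Hypothetical campus ranges (reserved documentation addresses, not a real university's).
CAMPUS_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_on_campus(client_ip: str) -> bool:
    """Return True if client_ip falls inside any registered campus range."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in CAMPUS_NETWORKS)


print(is_on_campus("192.0.2.17"))   # True: inside the first range, access granted
print(is_on_campus("203.0.113.5"))  # False: off campus, another method is needed
```

A VPN connection makes your requests arrive from an address inside one of the registered ranges, which is why the same check then succeeds.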

Want to find out your IP address?  Just go to

Single sign-on (SSO) – logging in via SSO identifies you as a member of an institution (such as a university) and therefore allows you access.  A great advantage of SSO login is that your authentication can be pushed from one site to another via your browser, so you don’t have to keep logging in when you go to a different subscription site that accepts SSO authentication.

Username and password – the old school method.  Nowadays, this only really applies to a small number of really expensive resources, where tight budgets or low demand mean that a several-user subscription rather than whole-campus access has been purchased.  There may only be (for example) 5 usernames and passwords for the resource, and if all 5 are in use, you will need to wait until someone has logged out so that you can use that ID to log in afresh.

Also good to know

What is a session identifier?  Session IDs or tokens are commonly used in online shopping sites and data/statistics databases.  These types of sites combine a variety of information to produce the page you are viewing, rather than retrieving a pre-prepared HTML page.  The session ID is used to track the individual user’s actions during the course of their session on the site.  Your shopping cart contents or dataset only exist because you have selected and combined certain elements during the session, which will time out after an order is finalised, or the user logs out, or after a period of inactivity.

URLs which contain “session” or “sid” indicate a session ID, and are not persistent.  If you are attempting to link to a resource, check the URL: if it contains a session ID, the URL will not work when someone tries to follow it later on because the session will have timed out.
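If you have a long list of links to check, the rule of thumb above can be automated.  The sketch below is a rough heuristic, not a definitive test: it only looks at the names of the URL’s parameters, the function name is my own, and the example addresses are invented.

```python
from urllib.parse import urlparse, parse_qs

# Markers taken from the rule of thumb above.
SESSION_MARKERS = ("session", "sid")


def looks_session_bound(url: str) -> bool:
    """Guess whether a URL carries a session ID and so is unlikely to persist."""
    parsed = urlparse(url)
    # Names of query parameters, e.g. "sid" in "?sid=8f3a&page=2".
    names = [name.lower() for name in parse_qs(parsed.query, keep_blank_values=True)]
    # Some sites put the ID in the path instead, e.g. ";jsessionid=ABC123".
    names.append(parsed.params.lower())
    return any(marker in name for name in names for marker in SESSION_MARKERS)


print(looks_session_bound("https://example.org/results?sid=8f3a&page=2"))  # True
print(looks_session_bound("https://example.org/article/12345"))            # False
```

A link that passes this check may still fail to be persistent for other reasons, so it is worth testing it in a fresh browser window as well.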

Some e-resources have embargoes which are periods during which access is not allowed (usually to protect the publishers’ interests, or in JSTOR’s words “protect the economic sustainability of our content providers”).  There are several types of embargo:

  • A rolling or moving wall – a fixed period of months or years.   For example, most journals in JSTOR have an embargo of 3 or 5 years, and as a new issue is published, its equivalent from 3 or 5 years before will become available on JSTOR.
  • An annual cycle – for example, all content before 1st January of this year is available.  This will add another year to the archive on 1st January of each year
  • A fixed date – for example, only content before 2005 is available
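The three types of embargo come down to three different date comparisons.  Here is a minimal sketch of each, assuming that a publication date is all you need to know about an item; the function names and the 2005 default simply mirror the examples in the list and do not come from any publisher’s system.

```python
from datetime import date


def years_before(day: date, years: int) -> date:
    """Return the same calendar day a given number of years earlier."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:  # 29 February has no equivalent in a non-leap year
        return day.replace(year=day.year - years, day=28)


def rolling_wall_available(published: date, today: date, wall_years: int) -> bool:
    """Rolling wall: available once the item is at least wall_years old."""
    return published <= years_before(today, wall_years)


def annual_cycle_available(published: date, today: date) -> bool:
    """Annual cycle: everything before 1st January of the current year."""
    return published < date(today.year, 1, 1)


def fixed_date_available(published: date, cutoff_year: int = 2005) -> bool:
    """Fixed date: only content published before the cutoff year."""
    return published < date(cutoff_year, 1, 1)


today = date(2024, 6, 15)
print(rolling_wall_available(date(2021, 6, 15), today, 3))  # True: exactly 3 years old
print(annual_cycle_available(date(2024, 1, 1), today))      # False: published this year
print(fixed_date_available(date(2004, 12, 31)))             # True: before 2005
```

Only the first two change over time: the rolling wall moves forward every day, and the annual cycle moves forward once a year.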

If you’re carrying out research in your subject area, make sure you don’t rely exclusively on resources with embargoes, as you will be missing current and recent material.

E-resources and copyright – keep your use legal!

Most e-resources publishers have a ‘fair dealing’ arrangement which allows you to print or save one article per journal issue.  Downloading an article happens when you view the article on screen, not just if you save it.  Please be aware that systematic downloading is not permitted under fair dealing arrangements and may compromise your institution’s access to the resource.  Also, remember that your access to e-resources is for your own research and learning only, and you may not email pdfs or other downloaded documents to anyone outside your institution.

See also: E-Resources – less frequently asked questions for the next part of the story…

Tips for using Oxford libraries

The 100+ libraries of the University of Oxford provide a comprehensive library service for the University. The libraries are grouped into three categories:

  • Bodleian Libraries (including the Bodleian Library)
  • Faculty and Departmental libraries
  • College libraries

Each library has its own rules, opening hours and lending practices.  This guide will give you a brief overview.  For specific enquiries, please read the relevant library’s homepage (where you will also find their contact details).

1. Which libraries may I use?

If you are a member of the University, you may usually access your own College library, your departmental/faculty library and the Bodleian Library.  Your registration at the Bodleian Library is normally automated, but you may need to register when you first visit one of the other libraries.

As well as providing resources for learning and research in different subject areas, different libraries also have a variety of study spaces which you can explore and find out where you work best.

2. Where are the libraries?

Please refer to this map of all the libraries of the University of Oxford

3. When are my libraries open and how can I contact them?

Start with this list of all the libraries at the University of Oxford which links to information about them and their websites (where applicable).  Please follow these links to find the opening hours and contact details for each library.

4. What if I have a disability?

The Bodleian Libraries have compiled information about library access and accessibility resources such as assistive technology for visitors with a disability.  Other libraries may have such information on their websites (please see 3 above) or you may contact them directly to enquire.

5. How do I find library material?

SOLO is the search interface to the library catalogues used by most of the libraries at the University.  Please refer to this guide to SOLO for help with searching the interface.

6. How many books may I borrow?

It depends on the individual library, and some libraries at Oxford are reference only and do not allow anyone to borrow from them.  You can see all the items you have on loan from libraries which use SOLO via the ‘My Account’ option on SOLO.  Help is also available in the guide to SOLO, under the “Renewals & your account” tab.

Whether and how you are allowed to renew an item will depend on the individual library. If online renewals are allowed you can carry these out whenever you see the option to login to ‘your account’ on SOLO.

7. What about fines?

All Oxford libraries set their own fines and fine rules. The best way to avoid them is to make sure you know the rules and get your books back on time!  Many libraries will send you automatic reminders via email.

8. How do I photocopy?

All libraries will have different procedures for photocopying.  The Bodleian Libraries have a system called PCAS and here is a guide to the PCAS copying system.

Don’t forget that it is your responsibility to make sure you stay within the law when making copies.  Please see the Bodleian Libraries’ copyright FAQ for further information.

9. How do I find e-resources?

E-resources can be found by searching on SOLO.  They are also listed on OxLIP+ (for databases) and OU eJournals (for e-journals).

10. How do I get connected to the internet?

Access to the University’s wireless networks (OWL and eduroam) is available in many of the libraries in the University.  Here is a list of Bodleian Libraries reading rooms with wireless access.



Digital Opportunity: A Review of Intellectual Property and Growth

Digital Opportunity, a review of Britain’s intellectual property law by Professor Ian Hargreaves, is published this month. The report concludes that the UK’s copyright laws are outdated and makes recommendations for a “clear change in the strategic direction of IP [intellectual property] policy direction designed to ensure that the UK has an IP framework best suited to supporting innovation and promoting economic growth in the digital age. This change is modest in ambition and wholly achievable.”

Here is an extract from the executive summary [with some of my comments]:

The Review’s specific recommendations would support growth of the UK’s increasingly intangibles intensive economy. This requires:
• an efficient digital copyright licensing system, where nothing is unusable because the rights owner cannot be found [no more orphan works – hurrah!];
• an approach to exceptions in copyright which encourages successful new digital technology businesses both within and beyond the creative industries;
• a patent system capable of preventing heavy demand for patents causing serious barriers to market entry in critical technologies;
• reliable and affordable advice for smaller companies, to enable them to thrive in the IP intensive parts of the UK economy;
• refreshed institutional governance of the UK’s IP system which enables it to adapt organically to change in technology and markets.
If the Review’s recommendations are acted upon, the result will be stronger rates of innovation and increased economic growth. An economic impact assessment conducted by the Review team, and of course subject to the high degree of uncertainty inherent in such projections, estimates that this would add between 0.3 per cent and 0.6 per cent to annual GDP growth. The path laid down in this review would also, over time, mean that IP law, including copyright law, would become clearer and be observed by most people without controversy [the Report notes that millions of citizens are in daily breach of copyright for format-shifting e.g. ripping a CD onto a computer; and the resulting confusion about what is allowed and what is not risks that the law falls into disrepute].

I am delighted that the report advises that “copying should be lawful where it is for private purposes, or does not damage the underlying aims of copyright”.  I fully agree that these changes will “help to make copyright law better understood and more acceptable to the public”.
I really hope that the government will accept this report and implement its recommendations.

Do you remember this?

If not, follow the image’s link to the Wikipedia article.

I love the “Home Sewing is Killing Fashion” parody 🙂