Conflicting priorities on information security

EBSCO have just released a White Paper “from our partner, OpenAthens”, The Evolution of Authentication and the Importance of Information Security.

The focus is very much on the information security of EBSCO’s subscription content.  There is no mention of user privacy, even though how individuals want their data to be used is often in conflict with how corporations want to use it.

Rather like the Leave campaign’s messages that voting for Brexit would be all gains and no losses, ignoring the complexity of difficult decisions creates blind spots and vulnerabilities in systems and societies.  I would like politicians and corporations to stop patronising us with simple, comforting, false solutions and engage bravely and intelligently with difficult decision-making.

Observe what happens if you click on “Download your copy for free today to continue reading”:

Please fill out the form to receive your free copy of The Evolution of Authentication and the Importance of Information Security. Fill out the form and immediately receive the white paper. The fields requested are: Name, Email, Organization Name, City, Phone. All fields except Phone are required.

Just saying…

See also: EBSCO EDS and Single Sign-On, and Consumer democracy? (reference to Adam Curtis’ film Bitter Lake, describing how politicians create oversimplified good vs evil stories rather than confronting the realities of a complex world).

EBSCO EDS and Single Sign-On

OpenAthens Single Sign-On (SSO) is a SAML-compliant Shibboleth-type authentication method used for University login to a wide range of electronic resources.

SSO works by mediating between an identity provider (e.g. a university, checking that the user’s account is current), and a service provider (e.g. a database, to which the user’s university has a current subscription).  Here’s a diagram of the data flow:

Authentication data flow. Image credit University of Florida.

Critically, the identity provider and the service provider don’t communicate directly.  The user’s personal credentials are not transmitted to the service provider; all it receives is confirmation that their identity has been verified.

This means that when someone logs in to a database or journal platform, they are greeted by “Welcome, University of Sunderland user” or “You are logged in as University of Sunderland”, but the database or platform does not know anything further about their identity.
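
The effect can be sketched in a few lines of Python. This is an illustration only: the user record, the attribute names and the `build_assertion` function are hypothetical and are not taken from any real OpenAthens or Shibboleth configuration. It simply shows the service provider receiving an organisation-level statement and nothing else.

```python
# Hypothetical record held by the identity provider (the university).
# None of these attribute names come from a real configuration.
IDP_RECORD = {
    "username": "jsmith42",
    "first_name": "Jo",
    "last_name": "Smith",
    "email": "jo.smith@university.example",
    "organisation": "University of Sunderland",
    "account_current": True,
}

# Assumption: the standard setup releases only the organisation name.
DEFAULT_RELEASE = ("organisation",)


def build_assertion(record, released_attributes):
    """Return what the service provider sees, or None if the account has lapsed."""
    if not record["account_current"]:
        return None
    # Only the attributes named in the release policy leave the university.
    return {name: record[name] for name in released_attributes}


def greeting(assertion):
    """The message a database or journal platform could show."""
    return f"Welcome, {assertion['organisation']} user"


assertion = build_assertion(IDP_RECORD, DEFAULT_RELEASE)
print(assertion)
print(greeting(assertion))
```

Changing `DEFAULT_RELEASE` is the only thing that alters what the service provider learns, which is why the choice of released attributes matters so much.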

Why does this matter?  Service providers’ servers may be located anywhere in the world, often outside the EU.  The Data Protection Act 1998 controls how personal information is used by organisations, businesses or the government.  It requires that data controllers (organisations etc) handle personal data according to people’s data protection rights, and do not transfer it outside the European Economic Area without adequate protection.

Recently, EBSCO have started promoting the use of an enhanced version of SSO which means that a user will be authenticated into EBSCO Discovery Service (EDS) and simultaneously logged in to their personal folders.  This will sound very appealing to many EDS customers, as currently the personal folders require the user to log in (again) with their EBSCOhost account (yet another userID and password to remember).  With the standard SSO setup, this would not be possible, so I started asking questions about what additional data exchange would be needed in order for the user to be individually identified.

Email from EBSCO:

Essentially the only requirement for setting up SSO is that your shibboleth releases a persistent unique ID. However we generally recommend releasing other attributes:

Which user data attributes must be included within the IdP-generated SAML assertion?

Only a unique user ID (e.g. employee ID, organization-specific email) is required to be sent in the SAML assertion. It is recommended that First Name, Last Name and Email also be sent to better support sharing and email from within the EBSCO user interface.

At the mention of persistent unique ID, I started to wonder about the data protection law implications.
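
To make “persistent unique ID” concrete, here is a minimal Python sketch of one way an identity provider could derive such an identifier. The email above does not say how the ID is generated (its own examples are an employee ID or an organisation-specific email), so the salted-hash approach, the `persistent_id` function, the secret and the service addresses are all assumptions for illustration. The point is that the value is the same at every login to a given service, which is what would allow personal folders to be attached to it, and also what allows one person’s activity to be linked across sessions.

```python
import hashlib
import hmac

# Hypothetical secret known only to the identity provider.
SECRET_SALT = b"institution-secret-kept-by-the-idp"


def persistent_id(local_user_id: str, service_entity_id: str) -> str:
    """Derive an opaque identifier that is stable per user and per service."""
    message = f"{local_user_id}!{service_entity_id}".encode("utf-8")
    return hmac.new(SECRET_SALT, message, hashlib.sha256).hexdigest()


# Hypothetical user and service identifiers.
first_login = persistent_id("jsmith42", "https://discovery.example/sp")
second_login = persistent_id("jsmith42", "https://discovery.example/sp")
other_service = persistent_id("jsmith42", "https://other-publisher.example/sp")

print(first_login == second_login)   # same person, same service
print(first_login == other_service)  # same person, different service
```

Even though the identifier does not contain the username, it still singles out one individual to the service that receives it.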

I followed this up with a phone call, asking about compliance with data protection law.   It seems that this query hadn’t previously arisen in the UK, though it had in Scandinavia where they are more aware of the issues.  Safe Harbo(u)r was mentioned, but I pointed out that in 2015, the European Court of Justice declared invalid the Safe Harbor data-transfer agreement that had governed EU data flows across the Atlantic for some fifteen years.  I was directed to EBSCO’s White Paper about information security, but it didn’t mention anything about data protection.

In advance of last week’s EBSCO and OpenAthens webinar “Single Sign-On to a World of Knowledge”, I repeated my enquiry to OpenAthens and received the following:

All data that is given to OpenAthens is stored here in the UK. We provide the option of mapping attributes out to various publishers however this is controlled and decided by you. The default information that is sent to authenticate the user does not hold any data that identifies the user personally.

To me, “this is controlled and decided by you” sounds very much like ducking the question.

I appreciate that decisions on the release of personal data are ultimately the responsibility of the data controller, but I am concerned that neither EBSCO nor OpenAthens seem to acknowledge the legal and ethical difficulties that this presents to libraries having to make these decisions.  I believe that if they are advocating this enhanced use of SSO, they have a moral obligation to point out the data protection implications, even if they can’t advise libraries on these matters.

I would be grateful to hear from anyone who knows more about this – please leave me a comment.  Thanks for any wisdom you can offer!

What is hyperauthorship?

Hyperauthorship

Historically, authorship of a journal article referred to those who contributed to the writing of the document.  More recently (and especially in the sciences because of the nature of the subject) authorship attribution is extended to a larger number of people who have contributed to the research behind the article.  Hyperauthorship refers to articles with more than 50 authors.  This 2015 physics paper lists over 5,000 authors.

It may be that there are over 50 people making a legitimate contribution to a paper, but in the context of citation metrics where researchers’ success is measured by the number of times a publication in their name has been cited, it is easy to see the potential for gaming the system.

This reminded me of Goodhart’s law: “When a measure becomes a target, it ceases to be a good measure.”

Contributorship and attribution models

Project CRediT arose from a workshop involving stakeholders interested in exploring contributorship and attribution models, and a working group developed a controlled vocabulary of roles that could be used to describe typical research “contributions” – here is the draft taxonomy.

PLOS Journals have their own taxonomy, and ORCID now supports the display of contributorship open badges on ORCID records.

Contributorship badges – image credit ORCID blog (Laura Paglione), http://orcid.org/blog/2015/08/11/contributor-recognition-update-orcid-project-credit-and-contributorship-badges

Journey to Full Text Finder – arrival in the Celestial City

Following my presentation Journey to Full Text Finder: A Pilgrim’s Progress at the EDS conference in July, here’s an update on how I got on with the migration from the old EDS to the new Full Text Finder (FTF) version.  Thanks again to John Bunyan’s The Pilgrim’s Progress (1678) for the inspiration for the title of this post.

On the whole, everything went smoothly, and I would particularly like to thank Seoud, Abid, and Adam at EBSCO for their help throughout the process. I have written this summary to help other people know what to expect from the process, and in particular the cascade effects on data checking and linking, for which it is essential to set time aside.

Side by side

Our migration to EBSCO Discovery Service Full Text Finder began with a visit from Seoud.  He talked me through the steps, and we agreed the timings.  We also discussed running the old and new EDS alongside each other, to allow students completing their courses in July and August to continue using the old version, while having the new version available for testing and experimentation before the beginning of the new academic year in September.

I asked a colleague in IT to create a redirect URL for EDS FTF which I then used wherever required, and this saved time later in updating URLs individually.

Data migration and checking

Following the data migration, our Periodicals Librarian spent time checking that our subscriptions in EDS FTF matched the old system, focusing on a few known trouble spots e.g. where we have single title subscriptions rather than a whole package. In some cases, a journal which existed from e.g. 1997-present but for which we only have access from 2015-present had been enabled for the full run, and not the years to which we have access.  This was particularly frustrating for titles for which our subscription is administered through EBSCONET, as EBSCO clearly have correct information in their system about our entitlements, but it was not being migrated or applied accurately.  Although this happened only for a small number of titles, there did not appear to be any pattern to predict which would be affected, and so all had to be checked.
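
If both lists can be exported, a check like this could in principle be scripted. The following Python sketch is only an illustration of the comparison: the titles, the start years and the `over_enabled` function are made up, and it assumes each list can be reduced to a title and a start year, which may not match what the EBSCO systems actually export.

```python
# Hypothetical data: the first year we are entitled to, per our subscriptions.
ENTITLED_FROM = {
    "Journal of Examples": 2015,
    "Example Review": 1997,
    "Annals of Illustration": 2010,
}

# Hypothetical data: the first year enabled in the discovery system after migration.
ENABLED_FROM = {
    "Journal of Examples": 1997,   # full run enabled by mistake
    "Example Review": 1997,
    "Annals of Illustration": 2010,
}


def over_enabled(entitled, enabled):
    """List titles whose enabled coverage starts before our entitlement does."""
    problems = []
    for title, entitled_year in sorted(entitled.items()):
        enabled_year = enabled.get(title)
        if enabled_year is not None and enabled_year < entitled_year:
            problems.append((title, enabled_year, entitled_year))
    return problems


for title, enabled_year, entitled_year in over_enabled(ENTITLED_FROM, ENABLED_FROM):
    print(f"{title}: enabled from {enabled_year}, entitled from {entitled_year}")
```

A script of this kind does not remove the need to check every title; it only makes checking every title quicker.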

Admin interfaces

Our test EDS FTF was up and running in June.  The old EDS was controlled through two admin interfaces – EBSCOadmin to control EDS itself, and A-to-Z Admin to control journals and databases A-to-Z lists.  The areas of overlap (and not) of the two admin mechanisms were sometimes unclear to me. The new EDS FTF is administered via EBSCOadmin, and it is great to have just one admin interface to drive this system.  As we had old and new EDS running simultaneously for a period, any changes would have needed to be made to each system separately.  I decided that from the launch of new EDS FTF, any changes would only be made to the new EDS, and this did not cause any problems.

Databases A-Z

The new EDS FTF has access to journal titles via the Publications link, but it has no Databases A-Z feature.  I have been told by EBSCO that they did not include this feature, because only librarians wanted it, not students (and also perhaps because there is less demand for it in EBSCO’s primary market, the USA).  However, there are some essential databases which are not indexed by EDS (such as Westlaw) and our users must have a route to access these.  EBSCO have an A-Z solution which can be added to your EDS FTF but you have to request it, and it is basic (just a list of links, like in the old days…).  It also has its own admin interface.

Linking

Permalinks created in the old EDS are different from permalinks in the new EDS FTF.  EBSCO have redirects in place, but “there is no timescale of how long these will be in place”.  It is therefore necessary to create new permalinks anywhere these are used, such as reading lists.  We also had links to journal titles using a linking template that worked with the old journals A-to-Z, and these had to be re-created based on the new “Publications” journal-finding tool in EDS FTF.  This was more urgent, as no redirects would be in place for the old journals A-to-Z.  This can add considerably to the workload of staff who maintain reading lists.

Google Scholar

If your library is set up for “Library links” to allow your users to use the library’s link resolver with Google Scholar, your settings will need to be updated to reflect the new resolver within EDS FTF.  EBSCO told me that this update would be included in the migration.  When I contacted EBSCO for confirmation, they confirmed that our resolver in Scholar had been updated to Full Text Finder, but that it could take 1-2 weeks for the changes to take effect, suggesting that my enquiry had prompted a change in the settings rather than this happening without my intervention.

Reblogged: Time for Elsexit?

Earlier this week, Timothy Gowers posted “Time for Elsexit?” about the new Elsevier deal negotiated with Jisc.  It’s not often that I can cater for my readers interested in Brexit and scholarly publishing simultaneously (enjoy!).  I found the parallels with Brexit interesting, and it’s an excellent summary of the problems that persist in the new deal.

Here is some background to the situation

  • Elsevier is one of the world’s major providers of scientific, technical, and medical information.
  • ScienceDirect is their main platform (website), which provides subscription-based access to a large database of articles and other research. Despite the name, it covers a wide range of subject areas.
  • Jisc Collections is the negotiation and licensing service that supports the procurement of digital content for higher education and research institutions in the UK.
  • A Big Deal is a subscription to most of a publisher’s content as a package, rather than having subscriptions to individual journals.  Publishers often swap titles in and out of the package.
  • Historic spend refers to a figure for each university, established at the point when Big Deals were launched (circa 1997).  Elsevier’s contract requires each subscribing university to match or exceed their historic spend, thus controlling cancellations, as cancellations of individual title subscriptions do not result in lower subscription fees.
  • Why the secrecy? Mike Taylor explains: “when negotiating contracts with libraries, publishers often insist on confidentiality clauses — so that librarians are not allowed to disclose how much they are paying. The result is an opaque market with no downward pressure on prices, hence the current outrageously high prices, which are rising much more quickly than inflation even as publishers’ costs shrink due to the transition to electronic publishing.”

Further reading

  • Serials crisis – the chronic subscription cost increases of many serial publications such as scholarly journals
  • The Cost of Knowledge – a protest by academics against the business practices of academic journal publisher Elsevier
  • Elsevier journals — some facts – including the following questions: How willing would researchers be to do without the services provided by Elsevier?  How easy is it on average to find on the web copies of Elsevier articles that can be read legally and free of charge?  To what extent are libraries actually suffering as a result of high journal prices?  What effect are Elsevier’s Gold Open Access articles having on their subscription prices?  How much are our universities paying for Elsevier journals?

Update: Martin Paul Eve, Jonathan Tennant, and Stuart Lawson have referred Elsevier/RELX to the Competition and Markets Authority on the grounds of abuse of a dominant market position, and problems in a market sector.

Megajournals and how to spot them in the wild

The first megajournal, PLOS One, launched in 2006.  Since then, the presence of megajournals in the Open Access (OA) landscape has been growing, and it’s increasingly important to know how megajournals differ from traditional journals:

  • when considering a paper for publication, peer-reviewers consider only whether it is technically sound, whereas traditional peer-review also has requirements for novelty, importance, or interest to a particular community
  • megajournals accept papers from a broad range of subjects (look out for “full spectrum”, “all areas”, “multidisciplinary”)
  • many megajournals’ funding model is to charge fees for publication – article processing charges (APCs) – and they typically charge lower APCs than traditional (hybrid) journals (average APC for full OA journal £1,354 compared with £1,882 for hybrid – Jisc data from 2014-15)
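
To put the Jisc figures quoted above in proportion, here is the arithmetic as a small Python sketch. The variable names are my own, and the two averages are the only inputs taken from the text.

```python
# Average APCs quoted above (Jisc data, 2014-15), in pounds.
FULL_OA_APC = 1354
HYBRID_APC = 1882

difference = HYBRID_APC - FULL_OA_APC
percent_lower = round(100 * difference / HYBRID_APC, 1)

print(f"Full OA is £{difference} cheaper on average")      # £528
print(f"That is {percent_lower}% below the hybrid average")  # 28.1%
```

These are averages across all full OA journals in the Jisc data, not megajournals specifically, so they indicate the scale of the gap rather than a price for any one title.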

These factors lower the bar for publication and may make these journals more attractive places for researchers to publish.  You can imagine the types of arguments that ensue about whether this sets the bar too low, or helps researchers with less funding to get published; and whether the different requirements at the peer review stage allow megajournals to be flooded with poorer-quality/lower-value articles or whether it breaks the stranglehold of academic hierarchies on what counts as valid research…

If megajournals don’t limit the number of articles in each issue, there is also the potential conflict of interest arising from money to be made from every article accepted for publication.  Traditional journals usually have a limit, which (hopefully) means their APC income generated from each issue published is constant, and papers submitted are judged purely on their own merits (but what happens if the supply of high-quality papers is greater than the journal can publish?).

Some things to consider:

  • The platform (or publisher’s name) has long been considered a proxy for the quality of the research it publishes.  To what extent is this still the case?
  • How are new publications to prove their worth?  To what extent are predatory publishing practices found?
  • How are we to assess the trustworthiness of a journal?  The reputation of the peer reviewers is often the best guide, and this requires good knowledge of the field and the people involved.  This is where discussions with academics in each department are essential in establishing the value of a megajournal to a given subject area.

Think.Check.Submit. is a campaign to help researchers identify trusted journals for their research – it’s a checklist researchers can use to assess the credentials of a journal or publisher.  It has some useful questions to use as a starting point for discussions with academics about judging journal quality.

Consumer democracy?

I have recently discovered the documentary films of Adam Curtis and can highly recommend “The Century of the Self” (2002) – it’s available on YouTube and the four 1-hour episodes are:

  1. Happiness Machines
  2. The Engineering of Consent
  3. There is a Policeman Inside All Our Heads; He Must Be Destroyed
  4. Eight People Sipping Wine in Kettering

The last 10 minutes gives an excellent summary of how politicians adopt methods used by business (e.g. focus groups) to give voters what they said they wanted, but this consumerism allows people only the illusion of control.  Rather than people being in charge, their desires are.  They exercise no decision-making power, and democracy demands no acts of citizenry but treats the public as passive consumers.  Responding to a mass of ever-changing and out-of-context individual opinions is very different from having a leader with a coherent plan.

This made me think of the way student feedback may be treated in universities, and whether it is used to inform or guide planning.  I’m all for a higher education sector which responds to student feedback, but I think consumer-driven universities risk focusing on students’ short-term desires at the expense of delivering the kind of challenging and transformative experience that produces confident graduates with useful skills.

See also last Friday’s episode of The Now Show, in particular Andy Zaltzman’s segment (begins at 17:45) about improv politics, and Pippa Evans’ song (26:42) “I’ve got an opinion, everybody listen to me…”

Curtis’ more recent films Bitter Lake and HyperNormalisation are currently available on BBC iPlayer.