EBSCO EDS and Single Sign-On

OpenAthens Single Sign-On (SSO) is a SAML-compliant Shibboleth-type authentication method used for University login to a wide range of electronic resources.

SSO works by mediating between an identity provider (e.g. a university, checking that the user’s account is current), and a service provider (e.g. a database, to which the user’s university has a current subscription).  Here’s a diagram of the data flow:

Authentication data flow. Image credit University of Florida.


Critically, the identity provider and the service provider don’t communicate directly.  The user’s personal credentials are not transmitted to the service provider; it receives only confirmation that their identity has been verified.

This means that when someone logs in to a database or journal platform, they are greeted by “Welcome, University of Sunderland user” or “You are logged in as University of Sunderland”, but the database or platform does not know anything further about their identity.
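To make that concrete, here is a minimal sketch in Python of what the service provider ends up knowing. It is a toy model rather than real SAML: the record fields, the function names and the example user are all my own assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserRecord:
    """What the identity provider holds about a user (hypothetical fields)."""
    username: str
    full_name: str
    email: str
    account_active: bool


def build_assertion(user: UserRecord, organisation: str) -> dict:
    """Return only what the service provider needs: an affiliation, not an identity."""
    if not user.account_active:
        raise PermissionError("account is not current")
    # No username, name or email is copied into the assertion.
    return {"organisation": organisation, "authenticated": True}


def greeting(assertion: dict) -> str:
    """What the database or platform can display with the data it was given."""
    return f"Welcome, {assertion['organisation']} user"


user = UserRecord("lib0xyz", "A. Student", "a.student@example.ac.uk", True)
assertion = build_assertion(user, "University of Sunderland")
print(greeting(assertion))
```

The point of the sketch is simply that the assertion carries the organisation and a yes/no, and nothing that picks out the individual.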

Why does this matter?  Service providers’ servers may be located anywhere in the world, often outside the EU.  The Data Protection Act 1998 controls how personal information is used by organisations, businesses or the government.  It requires that data controllers (organisations etc) handle personal data according to people’s data protection rights, and do not transfer it outside the European Economic Area without adequate protection.

Recently, EBSCO have started promoting the use of an enhanced version of SSO which means that a user will be authenticated into EBSCO Discovery Service (EDS) and simultaneously logged in to their personal folders.  This will sound very appealing to many EDS customers, as currently the personal folders require the user to log in (again) with their EBSCOhost account (yet another userID and password to remember).  With the standard SSO setup, this would not be possible, so I started asking questions about what additional data exchange would be needed in order for the user to be individually identified.

Email from EBSCO:

Essentially the only requirement for setting up SSO is that your shibboleth releases a persistent unique ID. However we generally recommend releasing other attributes:

Which user data attributes must be included within the IdP-generated SAML assertion?

Only a unique user ID (e.g. employee ID, organization-specific email) is required to be sent in the SAML assertion. It is recommended that First Name, Last Name and Email also be sent to better support sharing and email from within the EBSCO user interface.

At the mention of persistent unique ID, I started to wonder about the data protection law implications.
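A persistent unique ID does not have to be an email address or employee number. As an illustration only, here is a sketch of an opaque, per-service identifier derived with a keyed hash. This is an assumption on my part about how such an ID could be produced; it is not a description of what EBSCO or OpenAthens actually do, and the salt, user ID and service names are made up.

```python
import hashlib
import hmac


def pairwise_id(local_user_id: str, service_entity_id: str, secret_salt: bytes) -> str:
    """Derive a stable, opaque ID that differs for each service provider."""
    message = f"{service_entity_id}!{local_user_id}".encode("utf-8")
    return hmac.new(secret_salt, message, hashlib.sha256).hexdigest()


salt = b"hypothetical-secret-kept-by-the-institution"
id_for_service_a = pairwise_id("lib0xyz", "https://sp-a.example.org", salt)
id_for_service_b = pairwise_id("lib0xyz", "https://sp-b.example.org", salt)

# Same user, same service: the ID is the same on every login (persistent).
print(id_for_service_a == pairwise_id("lib0xyz", "https://sp-a.example.org", salt))
# Same user, different service: the IDs cannot be matched up by comparing them.
print(id_for_service_a != id_for_service_b)
```

Compare that with releasing "organization-specific email" as the unique ID, plus First Name and Last Name: those attributes identify the person directly.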

I followed this up with a phone call, asking about compliance with data protection law.   It seems that this query hadn’t previously arisen in the UK, though it had in Scandinavia where they are more aware of the issues.  Safe Harbo(u)r was mentioned, but I pointed out that in 2015, the European Court of Justice declared invalid the Safe Harbor data-transfer agreement that had governed EU data flows across the Atlantic for some fifteen years.  I was directed to EBSCO’s White Paper about information security, but it didn’t mention anything about data protection.

In advance of last week’s EBSCO and OpenAthens webinar “Single Sign-On to a World of Knowledge”, I repeated my enquiry to OpenAthens and received the following:

All data that is given to OpenAthens is stored here in the UK. We provide the option of mapping attributes out to various publishers however this is controlled and decided by you. The default information that is sent to authenticate the user does not hold any data that identifies the user personally.

To me, “this is controlled and decided by you” sounds very much like ducking the question.

I appreciate that decisions on the release of personal data are ultimately the responsibility of the data controller, but I am concerned that neither EBSCO nor OpenAthens seem to acknowledge the legal and ethical difficulties that this presents to libraries having to make these decisions.  I believe that if they are advocating this enhanced use of SSO, they have a moral obligation to point out the data protection implications, even if they can’t advise libraries on these matters.

I would be grateful to hear from anyone who knows more about this – please leave me a comment.  Thanks for any wisdom you can offer!


Gatekeeping – usertypes and permissions

Adapted from a poster presentation given at an internal event at the University of Sunderland

How old do you have to be to…? [jurisdiction: England & Wales]

Apply to adopt a child / Become a blood donor / Buy fireworks / Choose your own doctor / Claim benefits, and obtain a National Insurance number / Get married (with parental consent) / Get married without parental permission / Go into a bar and order soft drinks / Have a tattoo / If you were adopted, you can see your original birth certificate / Join the armed forces (with consent of parent/s or carer) / Make a will / No longer entitled to free full time education at school / Open your own bank account / Order your own passport / Pawn things in a pawn shop / Play the National Lottery (though not place a bet in a casino or betting shop) / Supervise a learner driver (if held driving licence for same type of vehicle for 3 years) / Vote in local and general elections / Wearing a seatbelt is considered your own personal responsibility

How old do you have to be to…?

Here are the answers – did you get them all correct?

21 Apply to adopt a child / 17 Become a blood donor / 18 Buy fireworks / 16 Choose your own doctor / 16 Claim benefits, and obtain a National Insurance number / 16 Get married (with parental consent) / 18 Get married without parental permission / 14 Go into a bar and order soft drinks / 18 Have a tattoo / 18 If you were adopted, you can see your original birth certificate / 16 Join the armed forces (with consent of parent/s or carer) / 18 Make a will / 19 No longer entitled to free full time education at school / 18 Open your own bank account / 16 Order your own passport / 18 Pawn things in a pawn shop / 16 Play the National Lottery (though not place a bet in a casino or betting shop) / 21 Supervise a learner driver (if held driving licence for same type of vehicle for 3 years) / 18 Vote in local and general elections / 14 Wearing a seatbelt is considered your own personal responsibility

How old do you have to be to… – answers

Are these laws consistent?  How is this related to the way in which they have developed?

Licences for electronic resources have evolved over time, and inconsistencies can appear because of historical precedent.  Consider the following table, showing a range of resources, and which types of people may access them:

Who do you need to be in order to access

This table is created by consulting the “authorised users” section of the licence for each resource.

Of those who are not current staff or students, it is walk-in users who receive the most generous entitlements.  This is because walk-in users have long been permitted to access print periodicals in academic libraries, and nowadays this is extended to include electronic journals (still within the library only).

The access entitlements of retired staff and “retired students” (i.e. alumni) are different, probably because it is assumed that retired staff will use this access to pursue academic research, whereas many alumni will be working in commercial settings.  If alumni were allowed access to their alma mater’s academic subscriptions, this could damage the publishers’ income from commercial licences for their information products, so publishers generally do not permit alumni access to their products.  NB: some publishers allow alumni access for an additional fee, usually for information resources for which there is no significant revenue from the commercial sector.

I’ve been working on a project to increase the granularity of our Single Sign-On authentication system, so that it can accommodate different types of users, and allow each group to access only the resources within its permission set.  I used this presentation to make the concept of usertypes and permitted resources more tangible, especially for people who don’t work in the e-resources (or indeed library) environment.
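For anyone who prefers code to posters, the idea of usertypes and permission sets can be sketched as a simple lookup. The usertype labels and resource names below are hypothetical placeholders, not our actual configuration or any real licence terms.

```python
# Each usertype maps to the set of resources its licence entitlements permit.
# All names here are invented for illustration.
PERMITTED_RESOURCES = {
    "student": {"ejournals_package", "business_database", "law_database"},
    "staff": {"ejournals_package", "business_database", "law_database"},
    "retired_staff": {"ejournals_package"},
    "walk_in": {"ejournals_package"},
    "alumni": set(),
}


def can_access(usertype: str, resource: str) -> bool:
    """Allow access only if the resource is in the usertype's permission set."""
    return resource in PERMITTED_RESOURCES.get(usertype, set())


print(can_access("walk_in", "ejournals_package"))
print(can_access("walk_in", "law_database"))
print(can_access("alumni", "ejournals_package"))
```

An unknown usertype gets an empty permission set, so the default is to refuse access rather than grant it.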

E-Resources – less frequently asked questions

This post follows on from E-Resources FAQ

A short history of remote or off-campus access

Eduserv developed the Athens system for remote access to e-resources.  It worked as a list of usernames and passwords hosted by Eduserv, and it allowed off-campus access without the need for VPN (which would authenticate the user via IP address).  VPN installation is not always easy (Mac users?) or possible (people in internet cafes or other places where they can’t download software onto the computer they’re using), so Athens was a great leap forward.

However, it was costly: JISC funded Athens access for UK higher education institutions and publishers also had to pay for it to work with their products.  JISC funded the access via Eduserv, but Athens was not a JISC product.

More recently, Shibboleth was developed as an open source software solution for web single sign-on for organisations, so it is free to use for both institutions and publishers.  In July 2008, JISC withdrew funding for Athens and started up their own access management organisation, The UK Access Management Federation.  Athens authentication continues to exist and is available on a subscription basis.

Hardly any US-based publishers (e.g. Highwire) used Athens, so switching to Shibboleth authentication meant that a wider range of resources was available off-campus than ever before.

Shibboleth is the technology that underlies our Oxford SSO (single sign-on) system.

What is EZproxy and how does it work with SSO?

EZproxy is another tool for remote access and it works by mimicking the Oxford IP range (like VPN):

EZproxy helps provide users with remote access to Web-based licensed content offered by libraries. It is middleware that authenticates library users against local authentication systems and provides remote access to licensed content based on the user’s authorization

Many e-journals and databases work with “Shibbolised” EZproxy, in which the proxy server is accessed via SSO.  The user is authenticated via SSO and then access to the proxy server is enabled, which allows access to the resource via IP address authentication.  This means that IP-authenticated resources which aren’t SSO-compliant can be accessed off-campus using SSO via Shibbolised EZproxy.
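Here is a toy model of that two-step flow in Python. It is not how EZproxy is implemented internally; the IP range, the proxy address and the function names are assumptions (the addresses come from ranges reserved for documentation).

```python
import ipaddress

# Hypothetical campus range registered with the publisher, and a proxy inside it.
CAMPUS_RANGE = ipaddress.ip_network("192.0.2.0/24")
PROXY_ADDRESS = ipaddress.ip_address("192.0.2.10")


def publisher_allows(source_ip) -> bool:
    """The publisher only checks whether the request comes from the campus range."""
    return source_ip in CAMPUS_RANGE


def request_via_proxy(sso_authenticated: bool) -> bool:
    """The proxy forwards the request from its own address, but only after SSO login."""
    if not sso_authenticated:
        return False
    return publisher_allows(PROXY_ADDRESS)


home_ip = ipaddress.ip_address("203.0.113.5")
print(publisher_allows(home_ip))        # direct request from home
print(request_via_proxy(True))          # same user, via the proxy after SSO login
print(request_via_proxy(False))         # no SSO login, so the proxy refuses
```

The publisher never needs to understand SSO: from its side, the request simply arrives from an address in the registered range.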

E-resources access and walk-in users

EZproxy doesn’t kick in on-campus, so IP-authenticated resources allow walk-in user access.  In universities, walk-in users are visiting scholars or people with reader access who are not members of the University, and do not have SSO accounts.

Some publishers (usually in the legal or business fields) do not want to allow walk-in user access to their resources, so they require SSO authentication even on-campus.  Shibboleth access is secure and also gives them log files of user activity, so they can trace anyone they suspect of breaking the terms of their licence, for example by systematic downloading of their content.

Usernames and passwords

A few publishers still rely on username and password authentication based on usernames that they issue.  Typically, these are legal databases whose business model involves selling access to a few people at a variety of institutions in the commercial sector, and so they are not set up for other authentication methods.

These usernames and passwords are then stored on an SSO-protected website, such as Weblearn, our university’s virtual learning environment.

Other advantages of SSO over Athens

SSO provides more up-to-date authentication, as it retrieves user information from the identity provider each time access is requested.  The usernames and passwords hosted by Eduserv were only updated every month or so, so someone who had previously been a member of the University would often still be able to access resources for some time after they left.  SSO permissions can be finely tuned so that a student will lose their e-resources access immediately after finishing their course, but retain SSO access to their email until several months later.  Users are more aware of the value of their SSO, since it lets them in to so many services, and are less likely to share (or sell) it to other (non-University) people.  This had been a problem in the past with Athens usernames and passwords.
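The fine tuning of permissions might look something like this sketch. The 90-day grace period for email is my own placeholder for "several months", and the function and key names are hypothetical.

```python
from datetime import date, timedelta

# Assumed grace period for email after the course ends (placeholder value).
EMAIL_GRACE = timedelta(days=90)


def entitlements(course_end: date, today: date) -> dict:
    """Work out, at the moment of each login, which services the student may use."""
    return {
        "e_resources": today <= course_end,
        "email": today <= course_end + EMAIL_GRACE,
    }


course_end = date(2024, 6, 30)
print(entitlements(course_end, date(2024, 6, 30)))  # last day of the course
print(entitlements(course_end, date(2024, 7, 1)))   # the day after
print(entitlements(course_end, date(2024, 12, 1)))  # well past the grace period
```

Because the check runs each time access is requested, there is no monthly batch update to wait for.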

How Shibboleth works

The aim of a single sign-on system is to be able to access multiple resources with a single identity.  A variety of service providers (SPs, such as e-resources publishers) can sign up to work with Shibboleth, and a range of identity providers (IdPs, such as universities) can have users’ accounts verified by Shibboleth:

Shibboleth acts as a mediator between the services and the users (with different identities, affiliations and levels of permissions).  Therefore, when you access ScienceDirect via SSO, Shibboleth checks who you are and details about the service you are trying to access.  If it can identify you as a member of the University of Oxford and verify that the University has a current subscription to ScienceDirect, it will allow you access.

To reward you for reading this far, here’s a gory story about where the term shibboleth comes from.