SSO works by mediating between an identity provider (e.g. a university, checking that the user’s account is current), and a service provider (e.g. a database, to which the user’s university has a current subscription). Here’s a diagram of the data flow:
Critically, the identity provider and the service provider don’t communicate directly. The user’s personal credentials are not transmitted to the service provider; just that their identity has been verified.
This means that when someone logs in to a database or journal platform, they are greeted by “Welcome, University of Sunderland user” or “You are logged in as University of Sunderland”, but the database or platform does not know anything further about their identity.
Why does this matter? Service providers’ servers may be located anywhere in the world, often outside the EU. The Data Protection Act 1998 controls how personal information is used by organisations, businesses or the government. It requires that data controllers (organisations etc) handle personal data according to people’s data protection rights, and do not transfer it outside the European Economic Area without adequate protection.
Recently, EBSCO have started promoting the use of an enhanced version of SSO which means that a user will be authenticated into EBSCO Discovery Service (EDS) and simultaneously logged in to their personal folders. This will sound very appealing to many EDS customers, as currently the personal folders require the user to log in (again) with their EBSCOhost account (yet another userID and password to remember). With the standard SSO setup, this would not be possible, so I started asking questions about what additional data exchange would be needed in order for the user to be individually identified.
Email from EBSCO:
Essentially the only requirement for setting up SSO is that your shibboleth releases a persistent unique ID. However we generally recommend releasing other attributes:
Only a unique user ID (e.g. employee ID, organization-specific email) is required to be sent in the SAML assertion. It is recommended that First Name, Last Name and Email also be sent to better support sharing and email from within the EBSCO user interface.
At the mention of persistent unique ID, I started to wonder about the data protection law implications.
I followed this up with a phone call, asking about compliance with data protection law. It seems that this query hadn’t previously arisen in the UK, though it had in Scandinavia where they are more aware of the issues. Safe Harbo(u)r was mentioned, but I pointed out that in 2015, the European Court of Justice declared invalid the Safe Harbor data-transfer agreement that had governed EU data flows across the Atlantic for some fifteen years. I was directed to EBSCO’s White Paper about information security, but it didn’t mention anything about data protection.
In advance of last week’s EBSCO and OpenAthens webinar “Single Sign-On to a World of Knowledge“, I repeated my enquiry to OpenAthens and received the following:
All data that is given to OpenAthens is stored here in the UK. We provide the option of mapping attributes out to various publishers however this is controlled and decided by you. The default information that is sent to authenticate the user does not hold any data that identifies the user personally.
To me, “this is controlled and decided by you” sounds very much like ducking the question.
I appreciate that decisions on the release of personal data are ultimately the responsibility of the data controller, but I am concerned that neither EBSCO nor OpenAthens seem to acknowledge the legal and ethical difficulties that this presents to libraries having to make these decisions. I believe that if they are advocating this enhanced use of SSO, they have a moral obligation to point out the data protection implications, even if they can’t advise libraries on these matters.
I would be grateful to hear from anyone who knows more about this – please leave me a comment. Thanks for any wisdom you can offer!