*with apologies to Jurassic Park
What is RAPTOR?
RAPTOR is a JISC-funded kit for looking at e-resources statistics. RAPTOR stands for Retrieval, Analysis, and Presentation Toolkit for usage of Online Resources. It was a JISC-funded project led by the University of Cardiff – read more about the project here. This post summarises my notes from the RAPTOR workshop in Birmingham earlier this week, delivered by Dr Rhys Smith and Dr Phil Smart of the University of Cardiff. The first version of RAPTOR was released in 2011.
Institutions have multiple authentication systems (e.g. Shibboleth, IP), and each logs usage by username. However, each of these logs is on a different system and in a different format, and some info is missing (e.g. usernames, departments). Federation operators also need stats to demonstrate value for money to their funders. RAPTOR is a piece of software which allows these usage logs to be collated.
- easy to install & configure
- not intrusive
- web front-end for non-tech users
- standards-based where possible
- free to use
- open source
The client (ICA – information collector agent) sends info to the server (MUA – multi-unit aggregator; web):
This picture isn’t as good but it captures Phil doing RAPTOR hands:
RAPTOR is a set of Java programs. Each component runs on its own Jetty instance. Components authenticate to one another using public/private keys and SSL handshakes. The team is working on exposing MUAs to SAML metadata instead of keys.
Supported authentication systems
- Shibboleth IdP
And soon to include OA LA (OpenAthens), OA LA proxy, simpleSAMLphp, Radiator – plus anything you can manually configure. You can configure RAPTOR to parse any log file you like; you just need to be brave.
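To give a feel for what "parsing any log file" involves, here is a minimal Python sketch that turns one log line into a common event record. It is not RAPTOR's own parser or configuration: the `timestamp|username|resource` layout, the `AuthEvent` class and the `parse_line` function are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class AuthEvent:
    """One authentication event in a common shape, whatever the source log."""
    timestamp: datetime
    username: Optional[str]  # may be missing, e.g. for IP-based access
    resource: str            # the service the user was signing in to
    source: str              # which authentication system logged the event


def parse_line(line: str, source: str) -> Optional[AuthEvent]:
    """Parse one line of a HYPOTHETICAL 'timestamp|username|resource' log."""
    parts = line.strip().split("|")
    if len(parts) != 3:
        return None  # skip lines that do not match the assumed layout
    raw_time, username, resource = parts
    try:
        timestamp = datetime.strptime(raw_time, "%Y-%m-%dT%H:%M:%S")
    except ValueError:
        return None  # skip lines with an unreadable timestamp
    # An empty username field is stored as None rather than as "".
    return AuthEvent(timestamp, username or None, resource, source)


# Example with invented data: one line with a username, one without.
sample = [
    "2012-03-05T09:15:00|jsmith|https://sp.example.org/shibboleth",
    "2012-03-05T09:16:30||https://sp.example.org/shibboleth",
]
events = [parse_line(line, source="example-log") for line in sample]
```

Once every log is reduced to the same record shape, events from different authentication systems can be counted together, which is the collation step described earlier.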
Application of RAPTOR
More information about usage, enriched with identity info, gives more business intelligence. RAPTOR can currently pull out department and affiliation from the IdP [identity provider]. This could be extended in future to include other attributes – let the RAPTOR team know your requirements.
You can use the data to show usage of e-resources by department, or system use by affiliation (e.g. UG/PG/staff), for example PC cluster room usage. You could also map e-resources usage to attainment info, with the caveat that correlation is not causation. SWITCH is the Swiss version of JANET – the SWITCH AMAAIS [Accounting and Monitoring of AAI Services] project is doing similar things to RAPTOR.
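As a rough illustration of usage by department or affiliation, the sketch below counts events by one identity attribute. The event dictionaries, the attribute names and the `usage_by` function are assumptions for the example, not RAPTOR's data model.

```python
from collections import Counter


def usage_by(events, attribute):
    """Count events per value of one identity attribute.

    Events with a missing or empty value are grouped under 'unknown',
    since some logs lack this information.
    """
    return Counter(event.get(attribute) or "unknown" for event in events)


# Invented sample events, already enriched with identity info.
events = [
    {"resource": "journals.example.org", "department": "History", "affiliation": "staff"},
    {"resource": "journals.example.org", "department": "History", "affiliation": "UG"},
    {"resource": "ebooks.example.org", "department": "Physics", "affiliation": "PG"},
    {"resource": "ebooks.example.org", "department": None, "affiliation": "UG"},
]

by_department = usage_by(events, "department")
by_affiliation = usage_by(events, "affiliation")
```

The same function works for any attribute released by the IdP, so extra attributes would not need new counting code.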
The RAPTOR-JUse project aims to integrate stats from people and platforms by combining data from RAPTOR about the activity of individuals (via the IdP) and data from JUSP [Journal Usage Statistics Portal] about journal usage stats from the SP [service provider] end.
RAPTOR and JUSP have different reporting periods – RAPTOR is per event; JUSP uses defined reporting periods. This is just one example of the issues to be overcome in this project.
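One way to bridge the two is to roll per-event data up into fixed periods. The sketch below assumes the reporting period is a calendar month, which is my assumption for illustration; the `monthly_totals` function is hypothetical and not part of either project.

```python
from collections import Counter
from datetime import datetime


def monthly_totals(timestamps):
    """Count events per calendar month, keyed as 'YYYY-MM'."""
    return Counter(ts.strftime("%Y-%m") for ts in timestamps)


# Invented event times: two in July, one in August.
event_times = [
    datetime(2012, 7, 1, 9, 0),
    datetime(2012, 7, 31, 23, 59),
    datetime(2012, 8, 1, 0, 0),
]
totals = monthly_totals(event_times)
```

Going the other way is not possible: once figures are reported per period, the individual events cannot be recovered from them.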
Demo of RAPTOR
The RAPTOR login page is comfortingly simple – though you can’t use federated login (for now). The irony was acknowledged 🙂 After logging in, you will see something like this:
Can you spot the summer holidays trough on this graph?
You can add postprocessors to sort rows, extract the top 10 only, etc. It’s possible to format the entity IDs with the SAML organisation name. The team hope to develop a layer in RAPTOR to represent stats by affiliation as a proportion of the total users, not just as a raw number.
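The two ideas here, a "top N" postprocessing step and counts shown as a proportion, can be sketched in a few lines of Python. This is an illustration only, not how RAPTOR implements postprocessors; `top_n` and `share_of_population` are hypothetical names, and dividing by the size of each affiliation group is just one possible reading of "proportion of the total users".

```python
def top_n(counts, n=10):
    """Sort rows by count, highest first, and keep only the first n."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


def share_of_population(active_users, population):
    """Express active users per group as a fraction of that group's size.

    Groups with no known population size get None instead of a fraction.
    """
    shares = {}
    for group, count in active_users.items():
        size = population.get(group, 0)
        shares[group] = count / size if size else None
    return shares


# Invented figures for illustration.
logins_per_resource = {"journals.example.org": 120, "ebooks.example.org": 45, "maps.example.org": 300}
busiest = top_n(logins_per_resource, n=2)

active = {"UG": 50, "staff": 10}
headcount = {"UG": 200, "staff": 40}
shares = share_of_population(active, headcount)
```

With these invented numbers, undergraduates show five times as many active users as staff in raw terms, but both groups come out at the same share of their population.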
RAPTOR data can be downloaded in .xlsx, .csv and .pdf formats. It’s not (yet?) possible to see total combined stats for different authentication mechanisms through the web interface – the problem is caused by the different host names being owned by different publishers. If unique IDs are brought in for publishers in future, this would then be possible. For any users who’ve dropped out of the directory, no values will be recorded.
- simple – good for test deployment but won’t scale well
- normal (one ICA on each service to monitor, MUA & web on a Raptor-server server (sic)) – good for large deployment, production use
- completely separate (ICA, MUA, web elements all on different servers) – probably overkill for most situations
RAPTOR local deployment options in diagrammatic form:
For different components to talk to each other, they need to know each other’s host name, and have encryption keys to swap. You could have Shibboleth and IP info going to different MUAs.
What do you want from RAPTOR?
Ease of config, supported systems, look & feel, dashboard, reporting vs graphing…? Let the team know what enhancements you would like to see! Tell them via the RAPTOR wiki.
WUGEN and WAYFless URLs
To explain what a WAYFless URL is, it’s best to begin with explaining what a WAYF URL is. WAYF stands for Where Are You From, and it’s a type of URL that allows you access to a service provider via single sign-on by including a step where you have to choose your institution/organisation – hence “where are you from?”. Therefore, a WAYFless URL is one which does not ask you for your institutional affiliation, and bypassing this step makes it easier and quicker for your users to access platforms.
Setting up service providers to work with your identity provider often involves building WAYFless URLs that are specific to your organisation. However, they can be brittle and prone to breaking if the target platform changes domain name structure.
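To make this concrete, here is a minimal Python sketch of hand-building a WAYFless URL. It assumes a service provider that exposes a Shibboleth-style login endpoint accepting `entityID` and `target` query parameters; real platforms vary, so check each provider's own documentation. All host names and the `build_wayfless_url` function are invented for the example.

```python
from urllib.parse import urlencode


def build_wayfless_url(sp_login_endpoint, idp_entity_id, target):
    """Build a login URL that names the IdP up front, skipping the WAYF step.

    sp_login_endpoint: the service provider's login handler (assumed here
                       to accept 'entityID' and 'target' query parameters).
    idp_entity_id:     your institution's identity provider entity ID.
    target:            the page the user should land on after signing in.
    """
    query = urlencode({"entityID": idp_entity_id, "target": target})
    return f"{sp_login_endpoint}?{query}"


# Invented example values.
url = build_wayfless_url(
    "https://sp.example.org/Shibboleth.sso/Login",
    "https://idp.example.ac.uk/shibboleth",
    "https://sp.example.org/journals",
)
```

The hard-coded endpoint and target are exactly the parts that break when a platform changes its domain name structure.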
And that’s where WUGEN comes in. WUGEN [WAYFless URL Generator] is a tool for building robust WAYFless URLs. The site leads you through a few steps and builds the URL for you.
Click on “Explain my WAYFless URL” to see a rating of the URL on the reliability thermometer:
Thanks Rhys and Phil for an excellent workshop 🙂 Before I left, there was time for a final RAPTOR hands moment: