At the moment one thing that is front and centre in my thinking about AI and machine learning in publishing and the scholarly ecosystem is how to make the case for ROI for investment in the technology, and more specifically investing in making data actionable.
Overall I think there is great promise for challenges like knowledge discovery and machine generated hypotheses, but there is massive potential for these technologies to also just make the quality of our work better, and to increase the value of our work by reducing and removing toil in the workplace.
The broader, more aspirational goals of AI will probably, by necessity, initially be available to only a small number of people to work on, and that work will need to find a path to many other people through intermediaries. It is in the smaller ambition of making our workplaces better, however, that perhaps more people can be directly impacted in the short to medium term.
To get there though, there are a number of blockers that have to be overcome.
1) Finding capacity to find problems worth working on.
This applies not only to AI, but to any business improvement. Work tends to expand to fill the time available, and so setting aside time to look for efficiency gains can feel risky, especially when there is no immediately obvious solution. Our work can create data as a by-product, and our systems and ways of working can be amenable to improvement if we take the time to look, but taking that time means that we have to go slower in some other areas for some period of time.
2) The expertise to understand how to work with data is expensive.
Even after we find problems to work on, finding people with data science skills is hard. I think there are some areas of hope here. As data skills become more available as core parts of university curricula, people entering the workforce are more capable. How do we create environments within our organisations that encourage these people to apply their skills to the work that they are doing? How do we leverage the skills that we have elsewhere and share them out within our organisations?
Helen King from BMJ pointed me towards Quartz AI Studio - Helping journalists use machine learning - an initiative to spur training around these technologies for journalism. Something similar in scholarly publishing would not go amiss.
3) The cost of change of our systems can be expensive.
Even after we have identified a problem, applied the skills to that problem and created a prototype solution, institutional inertia can make it hard to roll out a change. We need to encourage a change in how our entire organisation thinks about change to support this kind of internal innovation. One perspective on this is rightshifting (Rightshifting | Think Different).
4) We may have to change how we think about our data.
Data in our companies is often processed under a single use case framework: need to do X, make the data look like Y. To support machine learning we often need the data to be put into a more general form, and we often need to think about capturing data with a view to unspecified uses later. When raising requests like this, there can often be pushback about taking on work now for an unspecified future benefit. Overall I’m a fan of YAGNI - but in this case there is some other weighting of effort that might be useful.
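To make the contrast a bit more concrete, here is a minimal sketch in Python. Everything in it is hypothetical and only for illustration: the `SubmissionEvent` record, its field names, the event types and the `days_to_first_decision` report are my own inventions, not a description of any real system. The idea is to capture what happened in a general form, and then derive the "need X, make it look like Y" view from that, rather than storing only Y.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any


@dataclass(frozen=True)
class SubmissionEvent:
    """Hypothetical general-purpose record: what happened, when, and to what."""
    manuscript_id: str
    event_type: str  # assumed values, e.g. "submitted", "sent_to_review", "decision"
    occurred_at: datetime
    attributes: dict[str, Any] = field(default_factory=dict)


def days_to_first_decision(events: list[SubmissionEvent]) -> dict[str, int]:
    """One single-use view, derived from the general event log."""
    submitted: dict[str, datetime] = {}
    decided: dict[str, datetime] = {}
    # Walk the events in time order and keep the first of each kind per manuscript.
    for event in sorted(events, key=lambda e: e.occurred_at):
        if event.event_type == "submitted":
            submitted.setdefault(event.manuscript_id, event.occurred_at)
        elif event.event_type == "decision":
            decided.setdefault(event.manuscript_id, event.occurred_at)
    # Only report manuscripts that have both a submission and a decision.
    return {
        ms: (decided[ms] - submitted[ms]).days
        for ms in decided
        if ms in submitted
    }


# Made-up sample data, purely illustrative.
events = [
    SubmissionEvent("ms-1", "submitted", datetime(2019, 1, 1)),
    SubmissionEvent("ms-1", "sent_to_review", datetime(2019, 1, 5), {"reviewers": 2}),
    SubmissionEvent("ms-1", "decision", datetime(2019, 1, 31), {"outcome": "accept"}),
    SubmissionEvent("ms-2", "submitted", datetime(2019, 2, 1)),
]
report = days_to_first_decision(events)
```

The report only needs two of the event types, but the log keeps the rest (the `sent_to_review` event, the `attributes`) around for uses nobody has specified yet. That extra capture is exactly the work that tends to get pushback.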
5) Data is infantile, algorithms are just immature.
Working with data is working with a moving target. A good friend of mine describes it like this. Looking after a classical algorithm that you have created is like looking after a pre-teen. It needs some care and attention, but give it an iPad and some crisps and sit it in a corner and it will probably be OK as long as you check in every now and again.
Working with a machine learning system, where you have to take care of the data, is like looking after a toddler. You know the toddler is going to shit its nappy, you just don’t know when. The best you can hope for is that the nappy has no shit in it right now, but you know you are going to have to deal with a shitty nappy before too long.