bims: rfi pathways to ai enabled research
Title of submission
bims and NEP as expertise sharing systems
Describe the AI-enabled tool or application (200 words maximum)
[Note: this submission was written to inform, rather than impress. I use informal but precise language.]
We are in the same problem domain as meta.org, the Toronto-based startup bought by Chan-Zuckerberg. I suspect that it was sunset because it was too expensive and in a competitive field where many startups have failed.
We are still here. By “we” I initially mean “Bims: Biomed News”. We keep costs down by only using PubMed. We keep it simple by only dealing with the latest additions to PubMed. I wrote the entire technology stack. I admit that it’s a no-frills site. Our users are assumed to look for papers on the same topic every week. So they ought not to confuse the AI by looking for this today and for that tomorrow.
We see potential in bims because I found earlier success with “NEP: New Economics Papers”. NEP uses RePEc rather than PubMed. RePEc brings about 600 new papers a week as opposed to PubMed’s 30000. NEP covers close to 100 subfields of economics. The selections made by NEP editors are distributed to subscribers. They are reused in other parts of RePEc, e.g. to generate rankings of economists by subfield.
What is the value proposition and/or potential outcomes of your AI-enabled tool for facilitating the scientific process? (200 words maximum)
Bims and NEP users enjoy a value proposition hat-trick. First, users get a tool for keeping up with the most recent literature that no search-based tool matches. Second, without any extra effort, users get a web page with all report issues. They can demonstrate that they are up-to-date. Third, I manage a bespoke mailing list system, where weekly report issues are sent to subscribers. Gaining subscribers can be extra work but once users have them, they get more name recognition.
Bims has an additional value proposition for non-academic users. These are patients with long-term diseases, their carers and their support organizations. We see great potential for support organizations for people with rare diseases. After some training, bims can easily find papers related to a disease even if the disease is not mentioned.
For me personally, bims and NEP are a loss proposition. Building and further developing them takes up the bulk of my working time. True to the spirit of open science, we publish our attempts at getting funding at https://biomed.news/requests_for_support. We hope that as bims grows, we will eventually find a sponsor.
How will the tool support open science and/or expand access to the scientific process? (200 words maximum)
For the biomedical community, the central idea is that of expertise sharing. You are an expert and you demonstrate that by staying up-to-date. At the same time your selections diffuse your expertise. Bims experts can effortlessly practice open science.
For the economics community, NEP is part of the RePEc services. Over time, RePEc has published over 1 million working papers. RePEc services keep economists’ working paper culture thriving. In comparison, working papers in computer science have died out.
For all communities, the key issue in developing open science is giving people the incentive to practice it. Reviewing literature is a gentle introduction to the benefits of producing open science.
In a rejected funding application at https://openlib.org/home/krichel/proposals/tiumen.pdf I developed the expertise sharing idea further, i.e., beyond a current awareness service. The idea is to use machine learning to build machine-processable literature review objects. These can have a simple set structure of accepted and rejected documents. These objects can be combined by Boolean operators. This would set another example of sustainable open science practice based on literature reviews.
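To make the set structure concrete, here is a minimal sketch of what such a literature review object could look like. It is an illustration only, not what the proposal specifies: the class name `ReviewObject`, the helper `review`, the document identifiers, and the exact meaning given to AND, OR and NOT (accepted, rejected, or unknown for every document) are all my assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewObject:
    """Hypothetical review object: two disjoint sets of document ids."""
    accepted: frozenset
    rejected: frozenset

    def __post_init__(self):
        # A document cannot be both accepted and rejected in one review.
        if self.accepted & self.rejected:
            raise ValueError("accepted and rejected sets must be disjoint")

    def __and__(self, other):
        # AND: accepted by both reviews; rejected if either review rejects.
        return ReviewObject(self.accepted & other.accepted,
                            self.rejected | other.rejected)

    def __or__(self, other):
        # OR: accepted if either review accepts; rejected only if both reject.
        return ReviewObject(self.accepted | other.accepted,
                            self.rejected & other.rejected)

    def __invert__(self):
        # NOT: swap the roles of accepted and rejected documents.
        return ReviewObject(self.rejected, self.accepted)


def review(accepted, rejected):
    """Convenience constructor (hypothetical) taking any iterables of ids."""
    return ReviewObject(frozenset(accepted), frozenset(rejected))


# Made-up document ids, for illustration only.
muscle = review(accepted={"doc1", "doc2"}, rejected={"doc3"})
aging = review(accepted={"doc2", "doc3"}, rejected={"doc4"})

both = muscle & aging      # documents relevant to both topics
either = muscle | aging    # documents relevant to at least one topic
not_muscle = ~muscle       # documents explicitly judged off-topic for muscle
```

Documents that an editor never saw stay outside both sets, so a combined object never claims a judgment that no editor made.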
How does the tool mitigate harmful uses or risk associated with the technology? (200 words maximum)
First, I use SVM rather than neural networks. SVMs do not hallucinate.
Second, yes, report editors may overlook a document that is relevant. But that is an error of the editor, not of the technology. Even if there are false negatives in the training data, machine learning has safeguards against overfitting. Improved machine learning and improved detection of what editors look at can reduce this risk further.
[Warning: this paragraph is somewhat difficult to understand.]
Third and most importantly, there is a more profound reason why our systems are technologically risk-free. They are more human intelligence tools than artificial intelligence tools. Yes, AI is required to make them run. However, any AI can only be trained on past data. But the task of editors is to find what is new. New documents that contain only old ideas will appear at the very top of the AI-based rankings. Thus editors work against the AI to find the papers that are just below the very top. Those are the ones that actually have new ideas. This is something we cannot automate. We need actual people. Taking part in these projects can be a gentle introduction to open science.
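The three points above can be illustrated with a toy ranking. The sketch below is not ernad's code. It trains a small linear SVM by sub-gradient descent on the regularised hinge loss, using made-up titles, made-up parameter values and bag-of-words features that I assume purely for illustration. What it shows is that such a model only assigns scores to documents that already exist and reorders them; it produces no text of its own, and the editor still reads down the ranking.

```python
from collections import Counter


def features(text):
    """Bag-of-words term counts; a stand-in for real bibliographic features."""
    return Counter(text.lower().split())


def score(weights, bias, text):
    """Value of the linear decision function for one document."""
    return sum(weights.get(t, 0.0) * c for t, c in features(text).items()) + bias


def train_linear_svm(examples, epochs=50, lr=0.1, reg=0.01):
    """Sub-gradient descent on the L2-regularised hinge loss.

    examples: list of (text, label), label +1 (selected) or -1 (not selected).
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in examples:
            margin = label * score(weights, bias, text)
            # Regularisation step: shrink all weights slightly.
            for term in weights:
                weights[term] *= 1.0 - lr * reg
            # Hinge-loss step: update only if the margin is violated.
            if margin < 1.0:
                for term, count in features(text).items():
                    weights[term] = weights.get(term, 0.0) + lr * label * count
                bias += lr * label
    return weights, bias


def rank(weights, bias, candidates):
    """Return the candidates, highest score first. Nothing is generated."""
    return sorted(candidates, key=lambda t: score(weights, bias, t), reverse=True)


# Made-up past editor decisions and made-up new titles.
PAST_SELECTIONS = [
    ("muscle atrophy signaling pathway", +1),
    ("inflation monetary policy rates", -1),
    ("skeletal muscle mass regulators", +1),
    ("labor market policy wages", -1),
]
NEW_PAPERS = [
    "inflation policy forecast",
    "quantum dots synthesis",
    "muscle mass signaling in aging",
]

weights, bias = train_linear_svm(PAST_SELECTIONS)
ranked = rank(weights, bias, NEW_PAPERS)
```

The title that repeats the vocabulary of past selections comes out on top, which is exactly why editors need to look further down the list for work that is new.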
Progress made to date (200 words maximum)
NEP and bims have a long history.
In 1993, I started working on projects to improve the dissemination of working papers in economics via the then-emerging Internet. These projects seeded the RePEc digital library. I created “NEP: New Economics Papers” in 1998. As RePEc grew, so did the workload of NEP editors. In 2003 I created a bespoke software tool called ernad. Its core component is a web-based report issue creation tool. The backend of that creation tool was the first purely AI-driven bibliographic retrieval system. Since 2016, I have refactored ernad to allow it to be used on any XML-based bibliographic dataset, provided one adapts the templating.
In 2017, I found a biomedically trained person in Gavin McStay. Thus bims became a second ernad implementation. Keeping up with PubMed’s 30k new papers a week is notoriously hard. We have users who have been “bimsing” for years. But report number growth has been very slow.
In 2021, I received a generous €5000 grant from NLnet to build a new email distribution system for bims and NEP. Thus nitpo appeared in my life. Nitpo replaced the Mailman-based distribution of NEP reports. Bims introduced emailed report issues via nitpo.
Anything else you would like to share?
For fun, you can look at https://biomed.news/rfi_pathways_to_ai_enabled_research_chatgpt for a ChatGPT-powered version of the response prepared by Anna Vainshtein, the selector of bims-moremu on “molecular regulators of muscle mass”. That version is based on an earlier draft of mine. I for one find that it aims to impress rather than inform.