Use of a Wiki-Based Software to Manage Research Group Activities

Issues in Science and Technology Librarianship
Summer 2014
DOI:10.5062/F4KS6PJ1

Ting Wang
CLIR Data Curation Fellow
Library and Technology Services
Ting_Wang@alumni.unc.edu

Dmitri V. Vezenov
Associate Professor
Department of Chemistry
dvezenov@lehigh.edu

Brian Simboli
Science Librarian
Library and Technology Services
brs4@lehigh.edu

Lehigh University
Bethlehem, Pennsylvania

Abstract

This paper discusses the use of the wiki software Confluence to organize research group activities and lab resources. Confluence can serve as an electronic lab notebook (ELN), as well as an information management and collaboration tool. The article provides a case study in how researchers can use wiki software in "home-grown" fashion to organize their research activities and how librarians can play a role in exploring or advocating wiki software for these purposes. Most of the focus in our discussion will be on ELNs, but we will also address other uses of Confluence.

Introduction

In collaboration with the Council on Library and Information Resources (CLIR) and with support from the Alfred P. Sloan Foundation, Lehigh University's Library and Technology Services (LTS) supported the work of a postdoctoral curation fellow in exploring how to provide data curation services to support faculty research. The fellow worked with the science librarian in conducting interviews with principal investigators in the earth and environmental sciences, biology, chemistry, and physics departments about practices in data management (Sferdean & Wang 2013; Wang & Branch 2013). The purpose of the interviews was to identify data management issues and researchers' needs through the entire data lifecycle (Wang & Simboli 2013).
The interviews revealed many issues that arise at different stages of the data lifecycle, including metadata creation, data quality control, data access and sharing, data searching, data analysis, and data storage and preservation. The interviews also indicated that the needs and best practices of data curation vary by discipline, sub-discipline, or even by project. The interviews underscored that no single piece of software can address the research data management needs of every discipline throughout the data lifecycle. A flexible and comprehensive platform for organizing the activities of a research group is critical for effective control of the data it generates and requires for its day-to-day functioning.

Laboratory Notebooks: Overview and Literature Review

Laboratory notebooks play a critical role in the research lifecycle and fill a major role in data management, particularly of pre-publication research data (Briney 2012). Laboratory notebooks take the form of either paper laboratory notebooks (PLNs) or electronic laboratory notebooks (ELNs). PLNs are the traditional notebook format and are still popular in academic labs. With more and more data collection becoming digital, PLNs are less frequently used to record raw data, but are still the primary place to record experimental procedures and observations, instrument use logs, lab protocols, freehand notes, and 'print-cut-paste' graphs. ELNs play the same roles as PLNs but are superior in several important ways. Using keyboard, mouse, microphone, or scientific instruments, ELN users can input research data and records in basic formats (e.g., text, tables, images, and graphs) and, depending on the software capabilities, in special formats such as equations and signatures. In addition to easy and accurate input, ELNs also allow quick access and sharing among collaborators regardless of their geographic distance, as well as easy integration with other laboratory informatics tools (Machina & Wild 2013).
Therefore, ELNs address the growing challenges of data sharing, open science, research reproducibility, data retrieval, and other data curation issues (Bird et al. 2013a; b), thereby contributing to increased research productivity. There are dozens of ELN solutions; each has advantages and disadvantages. For academic libraries that play the major role in funding ELNs and archiving ELN data, the ideal ELN solution not only addresses researchers' needs, but is also affordable and sustainable for libraries.

Literature Review

ELNs have long existed to improve research performance and have been widely used in private sector research and development to meet legal and regulatory requirements (Nehme & Scoffin 2006). The recent increase in the use of touch-screen devices has further stimulated the development of ELN software and expedited the transition from the traditional PLN-based laboratory toward a paperless laboratory (Giles 2012). Rubacha et al. (2011) reviewed 35 ELN solutions available in the market and classified them into five major categories based on their market audience: research and development labs, quality assurance/quality control, biology, chemistry, and multidiscipline. Many of these commercial ELN solutions, such as e-CAT, Evernote, Ilabber, iPad ELN, LABTrack, and Labguru, also provide cloud services priced on the basis of storage size, number of users, period of use, or license. In contrast to their wide adoption in industry, ELN systems have met with some resistance in academia. Richardson (2009) found that among the respondents (at two universities) "over 90% of respondents indicated they used a paper laboratory notebook when recording experimental notes." However, "while the paper-based lab notebook is not dying off, additional electronic means of recording information are beginning to take hold" (p. 39).
He pointed out that, given the freedom and openness of the academic environment, academic researchers do not adopt the "one-size-fits-all" ELN solution of industrial researchers. Implementing ELNs in academia also faces additional challenges, such as expense and time, customization of a commercial ELN, and integration with specialized instruments and other laboratory systems (Drake 2007; Machina & Wild 2013). Also, the aforementioned Lehigh interviews indicated that, of those faculty members in biology, chemistry, and earth science who mentioned using laboratory notebooks, all used the print format. All these labs except one, that of Professor Vezenov (which uses Confluence), also used Dropbox and Google Drive for collaboration. The interviewees are inclined to use PLNs because PLNs enable freehand annotation, are low cost and easy to carry, have no learning curve, and do not require much administrative oversight. So despite the major drawbacks of PLNs for data input, search, use, access, and preservation, many researchers still use PLNs rather than switch to ELNs. However, in the current digital era, as data collection and research collaboration continue to take digital forms, the transformation from PLNs to ELNs may become increasingly likely, or (as discussed below) new ways will emerge to exploit the benefits of ELNs even while initially recording data in the form of PLNs. The researchers who already use ELNs in their research usually choose to customize or develop their own ELN solutions or workflows (Myers et al. 2001; Emerging Technologies for Researchers 2011; Hayes 2012; Walsh & Cho 2013; DoIT 2013; Milsted et al. 2013; Voegele et al. 2013). Some implement ELNs developed in-house, such as the Pacific Northwest National Laboratory ELN (Myers et al. 2001) and LabTrove, developed for a bioscience research laboratory (Milsted et al. 2013).
Some academic labs use existing open-source or commercial software to implement ELNs, such as MediaWiki ELN (Emerging Technologies for Researchers 2011), Microsoft OneNote ELN (Hayes 2012), Evernote ELN (Walsh & Cho 2013), eCAT ELN (DoIT 2013), and WordPress ELN (Voegele et al. 2013). Clearly, it is difficult for campuses with budgetary constraints to support so many ELN products. In particular, it is not cost effective for them to purchase subject-specific software for each department or disciplinary concentration at an institution. We therefore conclude that, at least for a mid-sized university such as Lehigh, it is important to support an ELN that provides basic functionality and that is easy to use, flexible enough to be applied in most labs, and affordable and sustainable.

Vezenov Group Use of Confluence for ELNs

In the course of doing data interviews, the CLIR fellow and science librarian learned that Lehigh chemistry professor Dmitri Vezenov has been using the wiki software Confluence to support all aspects of his research group's activities. (For the research areas of his group, see http://www.dvezenovgroup.org/.) Some wiki software is free and open source, for example MediaWiki, the software that Wikipedia uses. Other wiki software, such as Confluence, Jive, and SharePoint, is proprietary and mainly used as enterprise software. Prof. Vezenov was inspired to use Confluence by a similar use of a wiki platform made at Princeton by the physics professor Joshua Shaevitz. A web search indicates that many institutions have subscribed to Confluence, e.g., Cornell University, Indiana University, Purdue University, Stanford University, Tufts University, and University of Arizona. While many academic institutions appear to use Confluence as a collaboration tool, we are not aware of other institutions that use Confluence in ways similar to those of the Vezenov group.
Vezenov started implementing his group's use of Confluence for lab notebook and other purposes in the summer of 2011. With support from an LTS computing consultant, he created a Confluence lab space that: (1) provides the standard functions of a notebook, such as logging daily experiments, procedures, results, data, and ideas; (2) enables information management, including standard lab protocols, chemicals inventory, safety information, manuals for instrument operation, calendar/scheduling of instrument time and group meetings, and depositing of group products (such as slides for group meetings or published papers); and (3) serves as a platform on which group members can exchange ideas about, and collaborate on, project planning, computer code, research papers, presentations, and grant proposals. Implementing Confluence for these purposes is easy. Users only need to create and edit pages and then save them when they finish their input (Figure 1). Essential page functions such as page browsing, creating, and searching are always visible in the menu bar at the top of the space. The narrow sidebar on the left-hand side shows a hierarchical page structure, providing an alternative way for users to browse, search, and find pages. The rest of the page can be used to record lab notes and other research-related content. Users have a great deal of flexibility in how they enter and format their lab notes and data; they can type text and comments on the page and can insert tables, images, links, calendars, and other available macros (i.e., a series of commands and functions). Research group participants can exploit this flexibility to make ELNs that fit their individual laboratory workflows efficiently and comfortably.
The landing page of Vezenov's Confluence site includes: guidelines for creating an ELN, lab announcements, a group member directory that links to ELN pages of individual researchers, shortcuts to different lab resources (e.g., notebook archives, lab inventory, meetings, and standard operating procedures), and frequently used links to other institutional resources (e.g., Lehigh University's Library Chemistry guide page; Web of Science; SciFinder; and e-Handbook of Chemistry). The landing page therefore works as a table of contents, providing an overview of the lab space. In the sidebar, all the lab space content is grouped into two major categories: "Vezenov Lab Notebook" and "Vezenov Lab Resources" (Figure 1). The lab members may navigate these major pages and any of their child pages from the navigation sidebar. The sidebar navigation is generated automatically from the hierarchy of the space. Under the "Vezenov Lab Notebook" page, each group member creates an individual lab notebook. When a member leaves the lab, that member's lab notebook is archived under "Inactive group members," but the information remains fully accessible to the current members. Under the "Vezenov Lab Resources" page, the group gathers all the useful research resources, including papers, calendars, code, equipment list and manuals, inventory, manuscripts and presentations, and vendor information.

Figure 1. Home page of the Vezenov group's Confluence site.

A new lab member who creates an ELN may use the existing ELN templates created by other lab members, or instead create a new template and customize pages to individual tastes (see an example of a typical template in Figure 2). The template function allows members to reuse the notes structure every time they create a new page; it therefore saves a lot of time and guides the execution of routine experiments and common lab protocols.
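As a rough illustration of how a sidebar can be derived from a page hierarchy like the one described above, the following Python sketch models the lab space as a nested dictionary and renders an indented outline from it. This is only a sketch under stated assumptions: it does not use or represent the Confluence API, the member names are placeholders, and the names `lab_space`, `render_sidebar`, and `find_path` are hypothetical helpers introduced here for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical model of the lab space: each page title maps to a dict of
# its child pages (an empty dict means the page has no children).
# This is an illustration only; it is not how Confluence stores pages.
PageTree = Dict[str, "PageTree"]

lab_space: PageTree = {
    "Vezenov Lab Notebook": {
        "Member A": {},                # placeholder, not a real group member
        "Member B": {},                # placeholder, not a real group member
        "Inactive group members": {},  # archive for departed members
    },
    "Vezenov Lab Resources": {
        "Papers": {},
        "Calendars": {},
        "Code": {},
        "Equipment list and manuals": {},
        "Inventory": {},
        "Manuscripts and presentations": {},
        "Vendor information": {},
    },
}


def render_sidebar(tree: PageTree, depth: int = 0) -> List[str]:
    """Return one indented line per page, parents before their children."""
    lines: List[str] = []
    for title, children in tree.items():
        lines.append("  " * depth + title)
        lines.extend(render_sidebar(children, depth + 1))
    return lines


def find_path(tree: PageTree, title: str, prefix: Tuple[str, ...] = ()) -> List[str]:
    """Return the chain of page titles leading to `title`, or [] if absent."""
    for name, children in tree.items():
        path = prefix + (name,)
        if name == title:
            return list(path)
        found = find_path(children, title, path)
        if found:
            return found
    return []


print("\n".join(render_sidebar(lab_space)))
```

Because the outline is computed from the tree itself, adding a child page under any parent changes the rendered sidebar without further bookkeeping, which mirrors the automatic sidebar generation described in the text.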
The ELN entries can be organized by project (as child pages of the top notebook page) and by date within the project (again, as child pages). On each lab notes page, users usually record the project/experiment name, date, purpose, materials, procedures, results, and conclusions. They can organize data into tables and attach figures and equations to the lab notes if necessary. If they generate data during experiments, they can also type metadata for the generated data files directly into the lab notes.

Figure 2. ELN example for a specific experiment.

In the process of using Confluence as an ELN, Vezenov's group also gained insights into the advantages and limitations of this ELN solution.

Advantages:

Flexibility. Users can design their own ELN style and derive new ELN templates from the basic templates provided by Confluence (e.g., blank page, decision, file list, how-to article, share a link). Therefore, researchers have considerable freedom to develop ELNs suitable to their research workflow. Little administration is required beyond setting up the initial structure of the space.

Ease of use. Confluence is based upon HTML editing and requires only a browser for content entry. Therefore, beginners will find that creating ELNs is easy and intuitive. Computing consultants can help users come up to speed, but the learning curve is not prohibitive. Once a few members of a research group have created their ELNs, new members can learn best practices by inspecting the content in existing ELN pages.

Enhanced accessibility. Users can access their ELN from any computer with Internet access. Within an ELN, users also have easier access than in a PLN to certain specific pages (such as inventories) by using the link and search functions. Users can use the link function to set up cross-references between experiments and procedures (e.g., to avoid writing repetitive standard protocols).
To find a page, users just need to type a keyword or phrase into the search window. By browsing the entire space, users can also keep track of new developments in the group.

Enhanced collaboration. In addition to creating and sharing ELNs, users can also use Confluence to collaborate on other research activities, such as group updates to inventories, preparation of talk/poster presentations, development of computer code, writing of manuscripts/reports, and project planning.

Built-in authentication. Lehigh's LTS set up authentication and ensured that only Lehigh's users have access to Confluence. Using Confluence's permission settings, the ELN administrator can also manage the use privileges of individual users. Therefore, by adjusting the permission settings, the administrator can make portions of the ELN accessible to the public or make the entire ELN accessible only within the group. Lehigh's LTS backs up the data on a regular basis and protects the data against accidental loss. All the pages can be restored to any previous version, and users can archive their data by exporting their pages to PDF files.

Limitations:

No direct integration with chemical drawing software. All files created with chemical drawing software need to be uploaded into Confluence as attachments and then embedded into notes. It would be much more convenient if users could create and edit ChemDraw images directly within Confluence. Graphics cannot be edited within Confluence, and there is no cut-and-paste capability for graphics. All non-text editing has to be done in other software and the result then uploaded as a new file attachment into Confluence. Therefore, although users no longer need print-cut-paste graphs to keep graphic notes, they still need cut-and-paste functionality.

Not suitable for massive data storage. The current Confluence server at Lehigh has size limits for file attachments. Users have to store large data files (e.g., digital movies and images) on separate computers or servers.
In this case, care must be taken to map the lab notes to the corresponding data files. Vezenov's group addressed this issue by establishing a separate Linux server on Lehigh's network and incorporating a date stamp into each filename. In this way, properly dated notes and files are unambiguously linked.

In sum, Confluence or similar wiki software has both advantages and disadvantages that need to be evaluated against the needs and common lab practices of individual research groups. Some research groups may find more suitable ELN solutions than Confluence.

Librarian Involvement in Supporting Use of Wiki-Based Management of Research Group Activities

Working with computing consultants, librarians can help implement the use of Confluence or similar software by research groups. Librarians working in a merged library/computing services organization (such as Lehigh's) may find it easier to meet these goals, although computing/library cooperation is certainly possible in other organizational frameworks.

First, librarians can work with computing consultants to promote use of wiki-based software for the purposes described above. Faculty development programs can also get involved. Under the auspices of Lehigh's faculty development program, Vezenov discussed his experience using Confluence as an ELN in a well-attended, video-taped session for faculty, staff, and graduate students. The presentation's popularity suggested that many faculty members at Lehigh are unaware of the existing option to document their work electronically with Confluence, but are looking for ways to switch from PLNs to ELNs. Two faculty members who attended subsequently followed up to express their interest in using Confluence. A member of Lehigh's web and mobile services team met with the faculty members to discuss how to design and implement a Confluence web page suitable for their specific needs.
Second, librarians can work with computing consultants to assist researchers with data management plans (for example, to meet NSF requirements) and to help educate researchers about storage and access issues relating to ELNs. ELNs can be mentioned in these plans as the initial place for data storage, pending analysis of the data and possible open distribution of it.

Third, librarians can identify library-related resources (e.g., electronic databases and handbooks) that a researcher can link from a Confluence web page.

Finally, laboratory notebooks have great research value in documenting data collection procedures that lead to published papers. Given the important role of these notebooks in perpetuating the intellectual record, librarians should actively promote sustainable ways to ensure future access to them. To the extent researchers want to make their data publicly available, or are required to do so, involvement by librarians in assisting researchers to accomplish these goals is a natural outgrowth of their ongoing interest in open access to information.

Acknowledgements

We offer our sincere thanks to: Christine Roysdon, Director for Collections & Scholarly Communication, for comments about an earlier draft; Drupal Web Develop Specialist Colin Foley at the LTS Web and Mobile Services for providing technical support and instruction in Confluence use; and Senior Computing Consultant Daniel Brashler at the LTS Client Services for providing insights about use of ELNs in chemistry research.

References

Bird, C.L., Willoughby, C., Coles, S.J., & Frey, J.G. 2013a. Data curation issues in the chemical sciences. Information Standards Quarterly 25(3): 4-12. DOI:10.3789/isqv25no3.2013.02.

Bird, C.L., Willoughby, C., & Frey, J.G. 2013b. Laboratory notebooks in the digital era: the role of ELNs in record keeping for chemistry and other sciences. Chemical Society Reviews 42: 8157-8175. DOI:10.1039/C3CS60122F.

Briney, K. 2012. Lab notebooks as data management. SLA Winter Virtual Conference 2012. Available from: http://www.slideshare.net/kbriney/lab-notebooks-sla-talk

DoIT, University of Wisconsin-Madison. 2013. Moving towards using Electronic Lab Notebooks. News. [Internet]. [Accessed March 4, 2014]. Available from: https://www.doit.wisc.edu/news/moving-towards-using-electronic-lab-notebooks/

Drake, D.J. 2007. ELN implementation challenges. Drug Discovery Today 12(15-16): 647-649.

Emerging Technologies for Researchers. 2011. MediaWiki as an electronic lab book. Blog. [Internet]. [Accessed March 4, 2014]. Available from: http://blogs.unimelb.edu.au/researchservices/2011/11/23/mediawiki-as-electonic-lab-book/

Giles, J. 2012. Going paperless: The digital lab. Nature 481(7382): 430-431. DOI:10.1038/481430a.

Hayes, M. 2012. Electronic Lab Notebooks. Blog. [Internet]. [Accessed March 4, 2014]. Available from: http://postdocexperience.scienceblog.com/2012/11/19/electronic-lab-notebooks/

Machina, H.K. & Wild, D.J. 2013. Electronic laboratory notebooks progress and challenges in implementation. Journal of Laboratory Automation 18(4): 264-268. DOI:10.1177/2211068213484471.

Milsted, A.J., Hale, J.R., Frey, J.G., & Neylon, C. 2013. LabTrove: a lightweight, web based, laboratory "blog" as a route towards a marked up record of work in a bioscience research laboratory. PLOS ONE 8(7): e67460. Available from: http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0067460#pone-0067460-g008

Myers, J.D., Mendoza, E.S., & Hoopes, B. 2001. A collaborative electronic laboratory notebook. IMSA: 334-338.

Nehme, A. & Scoffin, R.A. 2006. Electronic laboratory notebooks. In: Ekins, S., editor. Computer applications in pharmaceutical research and development. Wiley: Hoboken, NJ. p. 209-228.

Richardson, K.A. 2009. Academic chemists' use of laboratory notebooks and other information management tools. Master's thesis, University of North Carolina at Chapel Hill. Available from: https://cdr.lib.unc.edu/record/uuid:3c728073-5a19-42a1-a9da-d48f89991ade

Rubacha, M., Rattan, A.K., & Hosselet, S.C. 2011. A review of electronic laboratory notebooks available in the market today. Journal of the Association for Laboratory Automation 16(1): 90-98.

Sferdean, F.C. & Wang, T. 2013. Catching up to the speed of data: insights from CLIR/DLF Data Curation Fellows on their journey towards developing data services. 2013 DLF Forum, Austin, TX. Available from: http://www.scribd.com/doc/180895630/Catching-Up-to-the-Speed-of-Data-Insights-from-CLIR-DLF-Data-Curation-Fellows-on-their-Journey-Towards-Developing-Data-Services-Poster

Voegele, C., Bouchereau, B., Robinot, N., McKay, J., Damiecki, P., & Alteyrac, L. 2013. A universal open-source electronic laboratory notebook. Bioinformatics 29(13): 1710-1712.

Walsh, E. & Cho, I. 2013. Using Evernote as an electronic lab notebook in a translational science laboratory. Journal of Laboratory Automation 18(3): 229-234.

Wang, T. & Branch, D. 2013. Exploring Best Practices for Research Data Management in Earth Science through Collaborating with University Libraries. 2013 AGU Fall Meeting, San Francisco, CA.

Wang, T. & Simboli, B. 2013. Assessing through interviews the data management behaviors and needs of an earth and environmental sciences academic department. Research Data Symposium 2013, Columbia University, New York, NY. Available from: http://hdl.handle.net/10022/AC:P:19181.

This work is licensed under a Creative Commons Attribution 4.0 International License.