Guide to Creating Metadata for Your Dataset in IU DataCORE
Research Data Management - email@example.com
What is Metadata and why is it important?
Metadata refers to the high-level information that researchers provide when depositing data in IU DataCORE. This information includes the Title, Creator, Description, Methods, Keywords, and other descriptive information. Metadata helps people who may be interested in a dataset to find it and understand it well enough to decide whether or not they would like to download it from IU DataCORE.
Metadata is different from documentation, such as readme files, field notes, or codebooks. Although both provide important information about a dataset, metadata is generally more succinct than documentation. Documentation, on the other hand, should provide more comprehensive information about the data, enough to enable others to reuse it for new projects or to reproduce the research to verify its findings.
What Metadata is asked for by IU DataCORE?
The following information provides more detailed instructions on completing the metadata fields in the “Deposit Your Work” form on IU DataCORE.
The term “Work” (see glossary) refers to the main organizational unit on IU DataCORE; each dataset and its accompanying metadata are considered part of a Work.
Provide a descriptive name for the Work. Knowing that the Title will be discoverable on IU DataCORE and through other search engines, answer the question, “What would you like this dataset to be known as?” Think of the dataset as a discrete research product that has value in and of itself. An entry in this metadata field is required.
Consider the following guidelines:
- Mention the general area of research and/or the specific topic in the title (including the name of the research group or project, if the name is descriptive of the research).
- Consider using the terms “data” or “dataset” or referencing specific methods to help distinguish the data from other research products, like publications.
- Avoid the use of acronyms and jargon that will not be immediately recognizable to those within your field.
Creating a Title that incorporates important aspects of the research in accessible language is key to helping others discover the data. Multiple entries are possible for this field. See the note about multiple entries later in this guide.
- Big Ship Data: Pre- and Post-Processed Spatiotemporal Data for 2006-2014 for Great Lakes Air Temperature, Dew Point, Surface Water Temperature, and Wind Speed
- Literature search strategies for "Substance Use Education in Schools of Nursing: A Systematic Review of the Literature"
- S'Urachi Site-Based Archaeological Survey 2015
- Code and Results for "The Emergence of Groups and Inequality Through Co-Adaptation"
- The Observation and Simulation Dataset for The Response of the Coupled Magnetosphere-Ionosphere System to the 15 August 2015 Solar Wind Dynamic Pressure Enhancement
List the name(s) of the person(s) and/or organization(s) responsible for creating the Work. For names, please use the following format: Last name, First name MI. Providing a full list of those responsible for the dataset ensures that all parties receive proper credit for their work. Multiple entries are possible for this field. See the note about multiple entries later in this guide. An entry in this metadata field is required.
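The name format above can be applied consistently with a small helper before you fill in the form. The following is a minimal sketch in Python, not part of IU DataCORE: the function name `format_creator` and the sample names are hypothetical, and the period after the middle initial is an assumption, since the guide specifies only "Last name, First name MI".

```python
def format_creator(first: str, last: str, middle: str = "") -> str:
    """Return a personal name as 'Last name, First name MI'.

    Hypothetical helper for preparing Creator entries; the trailing
    period after the middle initial is an assumption, not a rule
    stated in the guide.
    """
    first, last, middle = first.strip(), last.strip(), middle.strip()
    if not first or not last:
        raise ValueError("both a first name and a last name are required")
    name = f"{last}, {first}"
    if middle:
        # Keep only the first letter of the middle name as the initial.
        name += f" {middle[0].upper()}."
    return name


# Placeholder names, one string per Creator field.
creators = [
    format_creator("Jane", "Doe", "quinn"),
    format_creator("John", "Roe"),
]
```

Each string in `creators` would go into its own Creator field. Organization names do not follow this pattern and would be entered as they are.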
Enter the contact information (email address) for the individual who can best respond to questions about the Work. Contact information will be invaluable to other researchers if they need additional information or guidance when re-using the data or reproducing the research. Providing the information may also lead to opportunities for collaboration. An entry in this metadata field is required.
Explain the methods that were used to collect and process the data included in this Work. This could include:
- specific data-collection methods (e.g. content analysis, experiment, observation, simulation, survey) and details on how they were employed.
- tools used to collect, process, and/or analyze the data (e.g. names and versions of software, instruments, statistical tests).
Detailed information on methods will help users understand how the data relate to their own research and trust their quality. This information may take a few sentences or a paragraph to explain adequately. Anything more than a paragraph is probably too long for the metadata, however; a full accounting of the methods used to collect, process, and analyze the data can be saved for the documentation. An entry in this metadata field is required.
Though related to the “Methods” field, the information in the “Description” field should focus on what the data are, rather than how the data came to be.
In filling out the Description field, provide a general and brief description of the research that produced this data, including the researchers’ purpose or questions they wanted to answer. In addition, give an account of the files included in the Work, discussing their contents and the relationships between them. If specific software or scripts are required to access your data, this is a good place to document that as well. This may take a few sentences to explain fully. Information provided for this element will help users understand the contribution of this data to the discipline, as well as how the data files within the Work fit together to form a cohesive dataset. An entry in this metadata field is required.
If you would like to split your entry into multiple paragraphs, you may want to make use of the “+ Add another Description” option. Multiple entries are possible for this field. See the note about multiple entries later in this guide.
Select the date or the date range that corresponds to the span of time during which the data were collected and processed. If you want to indicate a date range, enter the beginning date in the boxes provided and then click on “+ Add date range” to generate boxes to enter an end date. Be as specific as you can in defining the beginning and end of the data production period. An entry in this metadata field is optional.
- 2015 - 2017
- 2015/10/31 - 2016/01/01
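If you track collection dates in a script or spreadsheet, it can help to check the range before entering it in the form. The following is a minimal sketch in Python; the function name `format_date_range` is hypothetical, and the `YYYY/MM/DD` layout is simply copied from the example above rather than from any documented IU DataCORE requirement.

```python
from datetime import date
from typing import Optional


def format_date_range(start: date, end: Optional[date] = None) -> str:
    """Format a single date or a date range as 'YYYY/MM/DD - YYYY/MM/DD'.

    Hypothetical helper; the layout mirrors the example in this guide.
    """
    if end is None:
        return start.strftime("%Y/%m/%d")
    if end < start:
        raise ValueError("the end date must not come before the beginning date")
    return f"{start.strftime('%Y/%m/%d')} - {end.strftime('%Y/%m/%d')}"


coverage = format_date_range(date(2015, 10, 31), date(2016, 1, 1))
```

The check on `end < start` catches the most common slip, which is swapping the beginning and end dates.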
Creative Commons License
Select a licensing and distribution option that will govern use of the Work. Your choice will let users know if and under what conditions they can share and re-use the included data. There are three options to select from:
- CC0 - This license places your work in the public domain.
- CC-BY - “This license lets others distribute, remix, tweak, and build upon your work, even commercially, as long as they credit you for the original creation.”
- CC-BY-NC - “This license lets others remix, tweak, and build upon your work non-commercially, and although their new works must also acknowledge you and be non-commercial, they don’t have to license their derivative works on the same terms.”
More information about Creative Commons licenses can be found on the Creative Commons organization’s website, which also provides an interactive tool to help you decide which license may be right for you. An entry in this metadata field is required.
Select the discipline(s) associated with the research for which the data were collected. This will aid in the discovery of your Work, as visitors to IU DataCORE are able to browse content by discipline.
To select multiple disciplines, hold Control on a PC or Command on a Mac and click on each option you want to select. They will remain highlighted, and you can proceed to the next element of the submission form. An entry in this metadata field is required.
From the drop-down list, select the primary funding agency that supported the research project in which the data were generated or collected. Identifying the funding agency can help demonstrate compliance with any data sharing requirements made by the agency and connect your research with a larger body of work. An entry in this metadata field is optional.
Enter any terms or topics that describe your work or would help people to find it in IU DataCORE. Quality keywords may include the following:
- Disciplines or sub-disciplines
- Research topics or areas
- Methods or tools central to the research
- Time periods and/or locations associated with the data
In addition to aiding discovery of the Work, entries for this element will help others identify the research as part of a specific field and understand the major themes that guided the activity. Multiple entries are possible for this field. See the note about multiple entries later in this guide.
- Citation analysis
- Copper Age
- Gender and nationality bias
- Organic semiconductors
- Spatial measures
- Substance abuse
- University libraries
List the language(s) in which the data and supplementary content are written. These could be spoken languages (e.g. English, Spanish, Mandarin, Arabic, etc.) or programming languages (e.g. C++, MATLAB, Python, XML, etc.). This information will convey to readers what expertise or technical skills and software they will need to understand the data. Multiple entries are possible for this field. See the note about multiple entries later in this guide.
Citation to Related Material
Enter a citation to any publication(s) that make use of or reference the data in this Work. Most often these will be articles and books by the Creator(s) or the research group. If the research or methodology is strongly connected to previous studies (either by the same research group or another one), you might also provide a citation for that material.
Include a full citation where possible. Otherwise, a URL, DOI, or other unique identifier to the related item will suffice.
If the publication has not yet been released, please make note of that in the field and plan to update Research Data Services once it has been published. Multiple entries are possible for this field. See the note about multiple entries later in this guide.
- Li, Y., M. C. Barth, G. Chen, E. G. Patton, S.-W. Kim, A. Wisthaler, T. Mikoviny, A. Fried, R. Clark, and A. L. Steiner (2016), Large-eddy simulation of biogenic VOC chemistry during the DISCOVER-AQ 2011 campaign, J. Geophys. Res. Atmos., 121, doi:10.1002/2016JD024942.
- King, A. E. and Blesh, J. (2018), Crop rotations for increased soil carbon: perenniality as a guiding principle. Ecol Appl, 28: 249–261. doi:10.1002/eap.1648
- Engel, D.D., Evans, M.A., Low, B.S., Schaeffer, J. (2017) “Understanding Ecosystem Services Adoption by Natural Resource Managers and Research Ecologists.” Journal of Great Lakes Research, 43(3), 169-179. DOI:10.1016/j.jglr.2017.01.005
- Tawanna R Dillahunt, Xinyi Wang, Earnest Wheeler, Hao Fei Cheng, Brent Hecht, and Haiyi Zhu. 2018. The Sharing Economy in Computing: A Systematic Literature Review. In Proceedings of the 21st ACM conference on Computer supported cooperative work & social computing. ACM. to appear.
- Eby, D.W., Molnar, L.J., Kostyniuk, L.P., St. Louis, R.M., & Zanier, N. (2011). Recommendations for Meeting the Needs of Michigan’s Aging Population. Report No. RC-1562. Lansing, MI: Michigan Department of Transportation.
Note on multiple entries
Some of the metadata elements allow you to create multiple fields to separate terms or content in submissions. For example, if the dataset has more than one creator, they should each be listed in separate Creator fields.
To create another entry, type the first entry in the provided field for that metadata element, and then click on the link just below that reads “+ Add another [element name],” which will create another field. Repeat this process as needed.
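Before opening the deposit form, it may be useful to draft the metadata in one place and confirm that nothing required is missing. The following is a minimal sketch in Python of such a checklist. The dictionary keys (`title`, `contact_information`, and so on) and the function name `missing_required` are hypothetical labels chosen for illustration, not identifiers used by IU DataCORE; the list of required fields follows the "required" statements in this guide. The creator and description values are placeholders.

```python
# Fields this guide marks as required (key names are hypothetical).
REQUIRED = (
    "title",
    "creator",
    "contact_information",
    "methods",
    "description",
    "license",
    "discipline",
)


def missing_required(record: dict) -> list[str]:
    """Return the required fields that are absent or blank in a draft record."""
    missing = []
    for field in REQUIRED:
        value = record.get(field)
        # Treat a single string like a one-item list of entries.
        if isinstance(value, str):
            value = [value]
        if not value or not any(str(v).strip() for v in value):
            missing.append(field)
    return missing


# A partial draft; fields that accept multiple entries are stored as lists.
draft = {
    "title": ["S'Urachi Site-Based Archaeological Survey 2015"],
    "creator": ["Doe, Jane Q."],
    "contact_information": "",
    "description": ["Placeholder description of the data files."],
    "license": "CC-BY",
}

todo = missing_required(draft)
```

Here `todo` lists the contact information, methods, and discipline entries as still needing attention. Storing multi-entry fields as lists mirrors the one-value-per-field approach described above.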
Have Questions? Need Help?
Please contact Research Data Management - firstname.lastname@example.org