The NeXus format has combined the ideas of different proposals originating in the United States and Europe. It has been developed at four workshops, SoftNeSS'94, '95 and '96 and NeXus'2001. This section summarizes the motivation for developing the format in more detail, as well as providing some historical background to its development. The details of the format itself are discussed in subsequent sections.
Until recently, the experimental research community did not have any motivation to standardize the form in which data are stored on computers. Scientists had self-contained, well-defined sets of programs to process and display the data. However, it is now more common for researchers both to perform experiments at a number of different institutions, and to attempt novel ways of data reduction. When they do, they quickly encounter the difficulties that arise from the lack of an established standard for exchanging the data. We discuss below some of the potential advantages of such a standard format.
Before developing data analysis or visualization software, the programmer must learn how to obtain meaningful data from the existing data files. This can require a considerable investment of time and effort to acquire an adequate understanding of local formats. Small but important details, e.g. the sign convention used in angle offsets or the value of electronic time delays, are not always well documented but are vital to interpreting the data. A common data format will obviate the need to acquire this expertise separately at every facility that a researcher visits.
If researchers are to attempt their own analysis, it is necessary to write format conversion tools. Writing individual tools is not a serious problem, but with the growing number of research groups performing scattering experiments, we are faced with a combinatorial explosion in the number of converters. A common data format reduces the number of converters from the order of n × n (one for each pair of formats) to 2n (one reader and one writer for each format). If all newly written programs were to read and write the common format, the need for data conversion would disappear altogether.
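The scaling argument can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it counts one-way converters between n distinct formats, which gives n(n - 1) rather than exactly n × n because a format never needs converting to itself.

```python
def direct_converters(n):
    """One-way converters needed when every format is converted
    directly to every other format: n * (n - 1)."""
    return n * (n - 1)


def common_format_converters(n):
    """Converters needed when every format only reads from and
    writes to one common format: 2 * n."""
    return 2 * n


if __name__ == "__main__":
    for n in (2, 4, 10, 20):
        print(n, direct_converters(n), common_format_converters(n))
```

Under these assumptions the common format needs fewer converters as soon as n exceeds 3, and the gap widens quadratically as more local formats are added.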
New software is being developed all the time at different facilities. However, the need for format conversion is normally a serious barrier to implementing these programs away from the institute in which they were developed. One result is that software with similar functionality is reproduced many times over.
We believe that a common format will lead to greater cooperation in software development both within and between the neutron and X-ray synchrotron communities. Many of the techniques of data manipulation and display are common to both communities, but there has been very little interaction between them. The ability to use common utilities to browse and plot neutron and X-ray data should make it easier to share software solutions to common problems.
There is an urgent need to develop more sophisticated visualization tools with the growth in the number of multidetector elements on new instruments. New scattering techniques (e.g. single crystal inelastic scattering on time-of-flight spectrometers) require novel ways of viewing the data. However, programmers are deterred from investing the necessary resources to produce these tools because their results are usually limited to single institutions, and in some cases, single instruments. It is also difficult to build on developments elsewhere.
Increased standardization can lead to increased functionality. It is now possible to 'drag and drop' word processor files onto printer icons because both word processors and printers commonly understand the PostScript language. Similarly, it will be possible to develop more automated means of reading and viewing experimental data if those data are stored in a standard form. By basing the standard on a public domain format, such as HDF, we can immediately take advantage of the tools already developed for that format.
In designing the NeXus format, the team has been guided by a number of general principles. These are intended to ensure that the format meets the needs of as wide a cross section of the neutron and X-ray scattering communities as possible. Fortunately, the computing technology available to us now allows us to be very flexible in the organization and comprehensiveness of the format without rendering the standard useless.
It is up to those implementing the standard at any institution to decide how much information to include in the NeXus files. However, the standard defines how such information should be stored if it is included. This allows analysis programs to search for the information, and prompt for it if it is absent. There could be substantial benefits from including more detailed instrumental information, and users will be encouraged to do so. However, in many cases, the files will be produced by automatic translation utilities from existing data files, so we cannot require more detail than is currently stored at any facility.
Although some NeXus files will be extremely simple, perhaps containing a single data set, the standard should be flexible enough to be able to describe extremely complex instrumentation comprising many different components. At some facilities, the standard will be used to archive the raw data and all related parameters so there has to be a method of storing this information in an easily assimilated fashion.
It is our intention that the format be defined for generic forms of neutron and X-ray instrumentation, e.g. triple-axis spectrometers, pulsed neutron powder diffractometers, etc. This will show developers of analysis software what information should be available in typical data files, and how to access it.
One aim of the standard is to facilitate automatic plotting and analysis as far as possible. This means that generic plotting software should be able to identify easily what parts of the file contain plottable data. These data should include independent axis scales, labeling, units and titles.
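As a rough illustration of what "identifying plottable data" could mean for a generic program, the sketch below models a file as nested Python dictionaries and searches it for data sets flagged as plottable. Everything here is an assumption made for the example: the dictionary layout, the attribute names `signal`, `axes` and `units`, and the group and data set names are hypothetical and are not taken from the NeXus definition or its API.

```python
def find_plottable(group, path=""):
    """Recursively yield (path, parent_group, dataset) for every data
    set whose attributes flag it as plottable (assumed: signal == 1).
    A dictionary with a "values" key is treated as a data set; any
    other dictionary is treated as a group."""
    for name, item in group.items():
        item_path = f"{path}/{name}"
        if "values" in item:
            if item.get("attrs", {}).get("signal") == 1:
                yield item_path, group, item
        else:
            yield from find_plottable(item, item_path)


def describe_plots(root):
    """Collect what a generic plotting tool needs: the dependent data,
    its independent axis, and the units of each."""
    plots = []
    for path, parent, dataset in find_plottable(root):
        axis_name = dataset["attrs"]["axes"]
        axis = parent[axis_name]
        plots.append({
            "y": path,
            "x": path.rsplit("/", 1)[0] + "/" + axis_name,
            "y_units": dataset["attrs"]["units"],
            "x_units": axis["attrs"]["units"],
        })
    return plots


# Hypothetical file contents, for illustration only.
example_file = {
    "entry": {
        "sample": {
            "temperature": {"values": [300.0], "attrs": {"units": "K"}},
        },
        "data": {
            "counts": {
                "values": [12, 40, 95, 38, 10],
                "attrs": {"signal": 1, "axes": "two_theta", "units": "counts"},
            },
            "two_theta": {
                "values": [10.0, 10.5, 11.0, 11.5, 12.0],
                "attrs": {"units": "degrees"},
            },
        },
    },
}

if __name__ == "__main__":
    for plot in describe_plots(example_file):
        print(plot)
```

The point of the sketch is that the plotting code never needs to know which instrument produced the file; it only relies on the agreed markers for plottable data, axes and units.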
Modern data formats no longer require the programmer to learn the details of the physical layout of the file in order to read or write data. This is because the interaction with the data file is through a program interface, commonly known as the Application Program Interface, or API. If the format is self-describing, the programmer only needs to know the keywords used to identify specific data sets, and perhaps the logical framework in which the data are stored.
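The idea that an API hides the physical layout can be sketched with a toy example. The code below is not the NeXus API and not HDF: it invents two deliberately different byte layouts (one JSON text, one packed binary) and a single hypothetical function, `read_dataset`, that retrieves a data set by keyword from either. All names and both layouts are assumptions made purely for illustration.

```python
import json
import struct


def write_json(datasets):
    """Layout A: a tag byte followed by a JSON document mapping
    data set names to lists of floats."""
    return b"J" + json.dumps(datasets).encode("utf-8")


def write_binary(datasets):
    """Layout B: a tag byte followed by packed records of the form
    name length, name, value count, values (little-endian doubles)."""
    out = bytearray(b"B")
    for name, values in datasets.items():
        encoded = name.encode("utf-8")
        out += struct.pack("<H", len(encoded)) + encoded
        out += struct.pack("<I", len(values))
        out += struct.pack(f"<{len(values)}d", *values)
    return bytes(out)


def read_dataset(blob, name):
    """The only call the application programmer needs: look up a
    data set by its keyword, whatever the underlying layout."""
    if blob[:1] == b"J":
        return json.loads(blob[1:].decode("utf-8"))[name]
    offset = 1
    while offset < len(blob):
        (name_len,) = struct.unpack_from("<H", blob, offset)
        offset += 2
        found = blob[offset:offset + name_len].decode("utf-8")
        offset += name_len
        (count,) = struct.unpack_from("<I", blob, offset)
        offset += 4
        values = list(struct.unpack_from(f"<{count}d", blob, offset))
        offset += 8 * count
        if found == name:
            return values
    raise KeyError(name)


if __name__ == "__main__":
    data = {"counts": [1.0, 4.0, 9.0], "two_theta": [10.0, 10.5, 11.0]}
    for blob in (write_json(data), write_binary(data)):
        print(read_dataset(blob, "counts"))
```

The calling code is identical for both layouts; only the keyword `counts` has to be known, which is the property the paragraph above attributes to self-describing formats accessed through an API.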
In our view, the neutron and X-ray community does not have the resources to develop and maintain a completely new data format and its associated interface software. An early decision therefore was to utilize one of the many existing formats as the basis of the NeXus format. In our selection, we used the following criteria:
The following data formats are widely used, well supported and available in the public domain. All were considered as possible formats on which to base NeXus, but rejected for the reasons given. The comments are not meant to imply a criticism of the design or functionality of these formats, all of which have served their user communities well, but rather to give a justification for our eventual selection.
The Hierarchical Data Format (HDF) has been developed at the National Center for Supercomputing Applications (University of Illinois, Urbana-Champaign). Like netCDF, it is binary, self-describing and extensible, and has been ported to a wide range of computer platforms, including PC and Macintosh, workstations and minicomputers, mainframes and massively parallel environments. It is actively maintained and used by an increasing number of organizations and scientific communities. Furthermore, many data visualization packages, such as IDL, MATLAB, PV-WAVE, AVS and Data Explorer, can access HDF files. It is even possible to view HDF files using web browsers, although this is not yet available for all platforms.
We have chosen HDF over netCDF because of the greater flexibility it gives us in organizing the instrument descriptions to be contained in NeXus files. This is because it is possible to organize data sets into a hierarchy of data ensembles. Many of the most attractive features of netCDF have been incorporated into recent versions of HDF (after v3.3r3) removing any obstacles to its adoption as the NeXus format.
Three parallel developments have led to the idea of a common data format for neutron and X-ray scattering data.
This formed the basis for the current design of the NeXus standard, which has been developed at two workshops, SoftNeSS'95 (NIST, Sept. 1995) and SoftNeSS'96 (Argonne, Oct. 1996). A brief description was published in the proceedings of the International Conference on Neutron Scattering, Toronto, 1997 (Ref. 4).
Several steps have been taken to obtain the support of the neutron and X-ray scattering communities:
The report of the original SoftNeSS workshop, SoftNeSS'94, was sent to over one hundred neutron and X-ray scatterers around the world inviting their comments and involvement in the format development.
One result is that many scattering facilities sent representatives to the next workshops, including Brookhaven, Chalk River, ILL, IPNS, ISIS, LANSCE, NIST and PSI. In addition, we obtained the support of many other facilities.
John White, the president of the neutron commission of the IUCr, has proposed that we seek the formal approval of the IUCr. Informal contacts have been made prior to a formal approach.
In January 1996, the proposal was presented by Ray Osborn to the Workshop on New Opportunities for Better User Group Software (NOBUGS), jointly organized by the ILL and ESRF, and attended by both X-ray and neutron scattering scientists and software developers from around the world.
In August 1996, the proposal was presented by Przemek Klosowski to the Neutron Scattering Satellite Meeting of the IUCr, NIST, Maryland.
A symposium was held during ICNS'97 in Toronto to present the NeXus format to the neutron scattering community. Ray Osborn described the design philosophy and Przemek Klosowski gave a technical presentation followed by demonstration of the Tcl/Tk interface that he and Nick Maliszewskyj have developed.
Przemek Klosowski attended the imgCIF Workshop held at Brookhaven National Laboratory on October 20 and 21, 1997, to discuss relations between the NeXus and imgCIF formats (See Note).
In December 1997, Mark Koennecke and Przemek Klosowski gave a tutorial session on using NeXus to the second Workshop on New Opportunities for Better User Group Software (NOBUGS), jointly organized by the APS and IPNS. As with the first workshop, it was attended by both X-ray and neutron scattering scientists and software developers from around the world.
Mark Koennecke and Przemek Klosowski represented the NeXus format at two canSAS workshops, which aimed to define interchange standards for neutron and X-ray small angle scattering.
In September 1999, Mark Koennecke attended a round-table discussion on data formats in muon spin relaxation. There was some interest in using NeXus, particularly since muon facilities at PSI and ISIS can cooperate with their neutron scattering neighbours. The minutes of their discussion are available on the web.
In August 1999 and January 2000, Ray Osborn gave introductions to the NeXus format to instrument scientists at ISIS and IPNS/SNS respectively. The talks included online demonstrations of the portability of the NeXus files (from VMS to Windows NT to Linux to Macs) and the ability to browse or display NeXus data with third-party HDF utilities, both freeware and commercial.
At the recent NOBUGS III conference, several members of the NeXus design team made presentations of their work on NeXus. Mark Koennecke described a Java web-based data server and browser, Przemek Klosowski discussed efforts at NIST to build a Tcl/Tk-based data explorer, and Chris Moreton-Smith presented a proposal to formalize the NeXus format in XML. There was also discussion of data formats in a workshop subgroup.
The first completely open NeXus workshop was held at the Paul Scherrer Institut, Switzerland, on March 20-21, 2001. It was attended by thirty-five scientists and software developers representing fourteen different neutron, X-ray, and muon facilities, and user institutions.
A breakout session was held at the first American Conference on Neutron Scattering, attended by representatives from Argonne, Los Alamos, SNS, ISIS, and ANSTO.
NeXus is still in the early stages of its development. We consider that it is now ready for evaluation by any interested members of the neutron and X-ray scattering communities. We therefore invite them to subscribe to the NeXus Mailing List to receive news updates and to send feedback. A critical component of the NeXus format is the Application Program Interface for reading and writing NeXus files. Mark Koennecke, Przemek Klosowski, and Freddie Akeroyd have produced the first version of the API, along with more detailed documentation for the interested programmer, which is available for downloading. Please see the API section for more details.
We intend to monitor the support of NeXus in the neutron and X-ray communities. If your home institute intends to support the format in any way, or if any of the following information is inaccurate, please let me know so that this record is kept up-to-date.
Nearly all the people involved in the development of the NeXus format have come from the neutron scattering community, with one notable exception (see below). However, it has been our goal that the design will also serve the needs of the X-ray synchrotron community, and that both can benefit from increased collaboration in the development of data analysis and visualization techniques. Here, we present the efforts that we have undertaken to fulfil that goal.
Although most of those involved in the NeXus project are neutron scatterers, the NeXus format is largely based on a proposal by Jon Tischler (ORNL/APS) for a standard APS format. Jon has attended all the SoftNeSS workshops to advise on technical issues concerning the use of HDF and to ensure that the modifications that we have made do not adversely affect the use of NeXus by X-ray synchrotron sources.
As a result of Jon's advocacy, NeXus is the recommended data format for use by the APS Collaborative Access Teams. It has been discussed on several occasions by the APS Beamline Controls Subcommittee, who have approved this advice, although it is the decision of each CAT whether to adopt NeXus as their local format.
The ESRF has agreed to provide utilities to translate their own data into the NeXus format (Bill Pulford, private communication).
In January 1996, Ray Osborn presented the NeXus proposal to the Workshop on New Opportunities for Better User Group Software (NOBUGS), jointly organized by the ILL and ESRF, and attended by both X-ray and neutron scattering scientists and software developers from around the world.
Note: In a round-table discussion at the workshop, we discussed the merits of the NeXus proposal compared to a proposal to extend CIF by including binary images (imgCIF/CBF). Those involved in the NeXus proposal did not think that imgCIF would have sufficient flexibility to describe complex neutron instrumentation. Those involved in the imgCIF proposal preferred the CIF use of ASCII header information and were concerned about HDF's ability to handle the high data rates at synchrotron sources. We have decided not to try to merge the two proposals at this stage, particularly since, in our view, they serve different purposes.
In December 1997, Mark Koennecke and Przemek Klosowski gave a tutorial session on using NeXus to the second Workshop on New Opportunities for Better User Group Software (NOBUGS), jointly organized by the APS and IPNS. As with the first workshop, it was attended by both X-ray and neutron scattering scientists and software developers from around the world.
The neutron community is attempting to define a set of generic neutron scattering instruments so that the data will be stored consistently from one site to the next, in order to simplify the task of developing comprehensive data analysis software. No such effort is yet underway in the X-ray community but we will be happy to assist anyone who believes that it would be worthwhile. Nevertheless, even without such standardization, NeXus users will benefit from the development of general data visualization applications and browser utilities, all of which should apply to any NeXus file whatever its origin.
Comments to: Ray Osborn <ROsborn@anl.gov>
Revised: Saturday, September 14, 2002
Copyright © 1996-2002, NeXus Design Team. All rights reserved.