Textreadr

From Eigenvector Research Documentation Wiki

Purpose

Reads ASCII flat files from MS Excel and other spreadsheets as a DataSet Object.

Synopsis

[out,usedoptions] = textreadr(file,delim,options)

Description

TEXTREADR reads tab, space, comma, semicolon or bar delimited files with names on the columns (variables) and rows (samples). Also handles Excel XLS files.

If TEXTREADR is called with no input, or with an empty matrix for the file name (file), a dialog box allows the user to select a file to read from the hard disk.

Inputs

  • file = One of the following identifications of files to read:
      a) a single string identifying the file to read
         ('example.txt')
      b) a cell array of strings giving multiple files to read
         ({'example_a' 'example_b' 'example_c'})
      c) an empty array indicating that the user should be prompted to locate the file(s) to read
         ([])
  • delim = An optional string used to specify the delimiter character. Supported delimiters include:
      • 'tab' or '\t' or sprintf('\t')
      • 'space' or ' '
      • 'comma' or ','
      • 'semi' or ';'
      • 'bar' or '|'
    If delim is omitted, the file is searched for a delimiter that is common to all rows of the file and produces an equal number of columns in every row.
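
The delimiter search described above can be illustrated with a minimal sketch in base MATLAB. This is not the toolbox's implementation: the sample text, the variable names, and the rule of accepting the first candidate that fits are assumptions made for illustration, and quoted fields are not handled.

```matlab
% Rows of a small delimited file, held as strings (made-up sample data).
txt = {'name;v1;v2', 's1;1.0;2.0', 's2;3.5;4.5'};

% Candidate delimiters, in the order listed above, with their names.
candidates = {sprintf('\t'), ' ', ',', ';', '|'};
names      = {'tab', 'space', 'comma', 'semi', 'bar'};

found = '';
for k = 1:numel(candidates)
  % Number of columns each row would split into with this candidate.
  ncols = cellfun(@(s) sum(s == candidates{k}) + 1, txt);
  % Accept a candidate that occurs in every row and gives equal column counts.
  if all(ncols > 1) && all(ncols == ncols(1))
    found = names{k};
    break
  end
end

disp(found)
```

A candidate that splits rows into differing numbers of columns is rejected, which is why a file with stray commas inside semicolon-separated fields would still resolve to 'semi' in this sketch.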

Outputs

  • out = A DataSet object with date, time, info (data from cell (1,1)), the variable names vars, sample names samps, and data matrix data. Note that the primary difference between this function and the MathWorks function xlsread is that TEXTREADR parses labels and returns a DataSet object.
  • usedoptions = the options structure that was actually used including modifications made during import (including when using the import tool to define columns/etc).
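
The call forms and outputs described above can be exercised as in the following sketch. It assumes that the toolbox is on the MATLAB path and that the file names, which are placeholders taken from the input descriptions, exist in the current folder. The .data field access assumes the standard DataSet object layout.

```matlab
% Read one tab-delimited file and keep the options that were actually used.
[out, usedoptions] = textreadr('example.txt', 'tab');

% Read several files at once; with delim omitted, the delimiter is detected
% from the file contents.
multi = textreadr({'example_a' 'example_b' 'example_c'});

% Pass an empty file argument to be prompted for the file(s) to read.
picked = textreadr([]);

% Inspect the results (field access assumes the standard DataSet layout).
disp(size(out.data));        % size of the imported data matrix
disp(usedoptions.parsing);   % parsing mode recorded in the returned options
```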

Options

Optional input options = a structure array with the following fields:

  • parsing: [ 'manual' | {'automatic'} | 'auto_strict' | 'stream' | 'importtool' | 'graphical_selection' | 'gui' ] determines the type of parsing to perform:
      'automatic' : the file is automatically parsed for labels and header information. This works on many standard arrangements with different numbers of row and column labels. It may take some time to complete with larger files. See the note below regarding additional options available with 'automatic' parsing.
      'auto_strict' : faster automatic parsing which does not handle header lines, and expects that all row labels will be on the left-hand side of the data and all column labels will be on the top of the columns. If this returns the wrong result, try 'automatic'.
      'manual' : the options below are used to determine the number of labels and header information.
      'stream' : nearly identical to 'automatic' but reads from the file in pieces. This allows reading somewhat larger files than might otherwise be readable because of memory limitations.
      'importtool' : shows importtool during parsemixed so that data, label, class,... columns and rows can be designated manually.
      'graphical_selection' : same as 'importtool'.
      'gui' : allows selection of standard options using a GUI.
    Note that when the file type is XLS, 'automatic' parsing is always performed.
  • commentcharacter: [''] any line that starts with the given character will be considered a comment and parsed into the "comment" field of the DataSet object. Default is no comment character. Example: '%' uses % as a comment character. NOTE: NOT used with 'auto_strict' parsing.
  • headerrows : [{0}] number of header rows to expect in the file. NOTE: NOT used with 'auto_strict' parsing.
  • catdim : [{0}] specifies the dimension that multiple text files should be joined on. 1 = rows, 2 = columns, 3 = slabs, 0 = automatically select based on sizes. Automatic mode joins in rows or columns if the other mode doesn't match in size, in the 3rd mode if BOTH dimensions match, and throws an error if no sizes match.
  • autopermute : [ false | {true} ] When true, multiple files joined in the 3rd dimension are also permuted so that multiple files form the ROWS of the output and the original rows and columns are moved into columns and slabs. This option is most often used for multiway data where each file is a separate sample and, thus, should be separate rows of the output.
  • waitbar : [ 'off' | {'on'} ] governs the use of waitbars to show progress.
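
The automatic catdim rule described above can be sketched in base MATLAB as follows. This is one reading of the rule, not the toolbox's code: it assumes that files are joined as new rows when only their column counts agree, and as new columns when only their row counts agree. The sizes are made-up values.

```matlab
% Sizes [rows cols] of two hypothetical files that have already been read.
sz = [10 5; 12 5];

rowsmatch = all(sz(:,1) == sz(1,1));   % do all files have the same row count?
colsmatch = all(sz(:,2) == sz(1,2));   % do all files have the same column count?

if rowsmatch && colsmatch
  catdim = 3;   % same shape in both dimensions: stack as slabs
elseif colsmatch
  catdim = 1;   % only column counts agree: stack as new rows
elseif rowsmatch
  catdim = 2;   % only row counts agree: append as new columns
else
  error('No dimension matches in size; the files cannot be joined.');
end

disp(catdim)
```

Setting catdim to 1, 2, or 3 explicitly bypasses this selection, which may be preferable when equally sized files should be stacked as rows rather than as slabs.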

Two additional options are used ONLY with 'manual' parsing. See parsemixed for similar options to use with 'automatic' parsing.

  • rowlabels : [{1}] number of row labels to expect in the file
  • collabels : [{1}] number of column labels to expect in the file
  • autobuild3d : [ {false} | true ] Governs automatic joining of equal sized files as slabs in a 3-way array. When true, multiple files which contain data that match in both rows and columns will be combined as separate slabs of a 3-way array. Often used with the autopermute option.
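
The effect of joining files as slabs and then permuting them, as described for autobuild3d and autopermute, can be illustrated with plain arrays. This is a sketch only: the matrices are made-up values, and the permutation order [3 1 2] is an assumption inferred from the description that files become rows while the original rows and columns move to columns and slabs.

```matlab
% Two hypothetical files, each read as a 4-by-3 data matrix (made-up values).
fileA = ones(4, 3);
fileB = 2 * ones(4, 3);

% Joining in the 3rd dimension: each file becomes one slab (4 x 3 x 2).
slabs = cat(3, fileA, fileB);

% Permuting so that files form the rows: the rows and columns of each file
% move to modes 2 and 3 (2 x 4 x 3).
permuted = permute(slabs, [3 1 2]);

disp(size(slabs));
disp(size(permuted));
```

After the permutation, each file occupies one row index of the result, so each file can be treated as a single sample in a multiway analysis.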


The default options can be retrieved using: options = textreadr('options');
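
As a sketch of how the options structure might be modified before a call, the following uses only fields documented on this page. The file name and the specific values (two header rows, '%' as the comment character) are placeholders chosen for illustration, and the toolbox is assumed to be on the MATLAB path.

```matlab
opts = textreadr('options');        % start from the default options
opts.parsing          = 'manual';   % use the label counts given below
opts.headerrows       = 2;          % expect two header rows
opts.rowlabels        = 1;          % expect one row label
opts.collabels        = 1;          % expect one column label
opts.commentcharacter = '%';        % lines starting with % are comments
opts.waitbar          = 'off';      % do not show a progress bar

% Read a comma-delimited file with the modified options.
out = textreadr('example.txt', 'comma', opts);
```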

In addition to the above options, if option parsing is set to 'automatic', any option used by the parsemixed function can be input to TEXTREADR. These options will be passed directly to parsemixed for use in parsing the file. See parsemixed for details.

Examples

Reading a file (in this case an XLS file) which has a single row of axis scale information at the top (first row) of the table and a single column of axis scale information at the left (first column) of the table can be done using this code:

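  opts = textreadr('options');          % start from the default options
  opts.axisscalecols = [1];             % first column holds axis scale information
  opts.axisscalerows = [1];             % first row holds axis scale information
  data = textreadr('myfile.xls',opts);  % read the file as a DataSet object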

Importing Data with importtool

See Also

areadr, dataset, parsemixed, spcreadr, xclgetdata, xclputdata, xlsreadr, importtool