Accessing and Paginating CSV Files with DataValve

While DataValve is mostly used with database-driven back ends, this tutorial shows how it can turn a comma delimited file into a paginated list of objects that the user can page through. We will then use the same data provider in a console application, a Swing application and a JSF web page through the DataValve data client interfaces.

We will start with a console application: create a new Maven project, add the DataValve dependencies, create the comma delimited data provider, define a row mapper class for it, and hook it up to our demo data. We will then attach the provider to a dataset so we can easily iterate through the data, and finally reuse it in the other clients.

  1. Create a new Maven application in your IDE and add datavalve-dataset as a dependency in your pom.xml file.
    	<dependencies>
    		<dependency>
    			<groupId>org.fluttercode.datavalve</groupId>
    			<artifactId>datavalve-dataset</artifactId>
    			<version>0.9.0.CR2</version>
    		</dependency>
    	</dependencies>
    
  2. Create a new class called Person in the package of your choice. I used org.fluttercode.tutorials.datavalve.csvreader. We will only be using one package in this demonstration.
  3. The Person class will be our model which will be populated from the comma delimited file.
    public class Person {
    
    	private Integer id;
    	private String firstName;
    	private String middleName;
    	private String lastName;
    	private Date dob;
    	private String email;
    	private String address;
    	private String city;
    	private String zip;
    
    	public Person() {
    	}
    
    	public Person(Integer id, String firstName, String lastName,String middleName,
    			Date dob, String email, String address,
    			String city, String zip) {
    		super();
    		this.id = id;
    		this.firstName = firstName;
    		this.middleName = middleName;
    		this.lastName = lastName;
    		this.email = email;
    		this.dob = dob;
    		this.address = address;
    		this.city = city;
    		this.zip = zip;
    	}
    
    	public String getName() {
    		return firstName + " " + lastName;
    	}
    
    	public String getAddressText() {
    		return address+ "," + city+","+zip;
    	}
    
    	@Override
    	public String toString() {
    		return String.format("Person [id=%d, name=%s, dob=%s,address=%s, email=%s",
    		    id,getName(),dob,getAddressText(),email);
    	}
    
        .... Getters and Setters Omitted ....
    }
    
  4. In order to convert the csv data to an object, we need to create a ColumnarRowMapper instance. This takes a row of data, already converted to an array of string values, builds the model object from it and returns it back to the caller. Create a new class called PersonRowMapper.
  5. Implement the mapper as follows:
    public class PersonRowMapper implements ColumnarRowMapper<Person> {
    
        private static SimpleDateFormat converter = new SimpleDateFormat(
    			"MM/dd/yyyy");
    
    	public Person mapRow(String[] values) {
    		Date dob = null;
    
    		try {
    			dob = converter.parse(values[5]);
    		} catch (ParseException e) {
    			// leave dob as null when the date cannot be parsed
    		}
    
    		return new Person(DataConverter.getInteger(values[0]), values[1],
    				values[2], values[3], dob, values[4], values[6], values[7],
    				values[8]);
    	}
    }
    

    This row mapper implements the ColumnarRowMapper interface and implements the single method to return a built entity object from the array of column values passed in.

  6. The only other element we need is some test data to run it on which you can download from here. Download this file and place it in the package folder with your source code. Depending on your IDE, you may need to refresh the folder in the package explorer so it can pick up the new file in the folder. The example file has 100 rows of data in it.
  7. Create a new class called ProviderFactory which will create our provider and initialize it with the CSV file url.
    public class ProviderFactory {
    	
    	public static CommaDelimitedProvider<Person> createProvider() {
    		URL url = ProviderFactory.class.getResource("data.csv");
    		CommaDelimitedProvider<Person> provider;
    		provider = new CommaDelimitedProvider<Person>(url.getFile());
    		provider.setRowMapper(new PersonRowMapper());
    		return provider;
    	}
    }
    

    We use a URL to locate the csv file on the classpath and pass its file path (url.getFile() returns a String, not a File) to our CommaDelimitedProvider. This relies on the data.csv file being in the same location as this class, and on that location being a regular folder on disk; a path obtained this way will not point to a readable file if the classes are packaged inside a jar. Alternatively, you can put the file elsewhere and reference it directly rather than use the URL.

  8. For the console part of this demo, create a Main class and add a main method containing the code to get an instance of the provider and perform a simple iteration through the data.
    public class Main {
    
    	public static void main(String[] args) {
    		CommaDelimitedProvider<Person> provider;
    		
    		provider= ProviderFactory.createProvider();
    
    		List<Person> results = provider.fetchResults(new DefaultPaginator());
    		for (Person p : results) {
    			System.out.println(p);
    		}
    	}
    }
    

    In this example we are reading all the results into memory at once and displaying the whole list.
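
The row mapping from steps 4 and 5 can also be sketched without any library classes. The following is a minimal, self-contained illustration and not DataValve code: `RowMappingSketch`, `Contact`, and the three-column layout are assumptions made up for this example. It maps the values by position and leaves the date as null when it cannot be parsed, rather than failing the whole row.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical stand-alone sketch; none of these names come from DataValve.
class RowMappingSketch {

    // Small model object populated from one row of values.
    static class Contact {
        final Integer id;
        final String name;
        final Date dob; // null when the source value could not be parsed

        Contact(Integer id, String name, Date dob) {
            this.id = id;
            this.name = name;
            this.dob = dob;
        }
    }

    // Assumed column layout for this sketch: id,name,dob (MM/dd/yyyy).
    static Contact mapRow(String[] values) {
        // SimpleDateFormat is not thread-safe, so create one per call.
        SimpleDateFormat format = new SimpleDateFormat("MM/dd/yyyy");
        format.setLenient(false); // reject impossible dates such as 02/29/2001
        Date dob = null;
        try {
            dob = format.parse(values[2]);
        } catch (ParseException e) {
            // Keep dob as null instead of rejecting the whole row.
        }
        return new Contact(Integer.valueOf(values[0].trim()), values[1].trim(), dob);
    }

    public static void main(String[] args) {
        // A naive split is enough here; it does not handle quoted commas.
        Contact c = mapRow("42,JANE DOE,02/29/2001".split(","));
        System.out.println(c.id + " " + c.name + " dob=" + c.dob);
    }
}
```

Creating the date format inside the method is a deliberate choice for the sketch: a shared static SimpleDateFormat is cheaper, but it is only safe when a single thread does the mapping.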

Paginating the results

We can control which results and how many are displayed by using a paginator to control the flow of data. This can be useful if you have a huge file and it would be impractical to read all the results into memory in one go. Datasets have paginators built in and use a reference to a provider to fetch the actual data.
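
To make the idea of a paginator concrete, here is a minimal sketch that reads only one page of lines from a text source. It illustrates the general technique rather than the way DataValve is implemented; `PageReadSketch`, `readPage`, and the sample data are hypothetical. The two parameters play the role of the first-result and max-rows settings that a paginator carries.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of offset/limit paging over a text source.
class PageReadSketch {

    // Skips firstResult lines, then returns at most maxRows lines.
    static List<String> readPage(Reader source, int firstResult, int maxRows) {
        List<String> page = new ArrayList<>();
        try {
            BufferedReader reader = new BufferedReader(source);
            String line;
            int index = 0;
            // Stop reading as soon as the page is full.
            while (page.size() < maxRows && (line = reader.readLine()) != null) {
                if (index >= firstResult) {
                    page.add(line); // only rows inside the page are kept in memory
                }
                index++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return page;
    }

    public static void main(String[] args) {
        String data = "row0\nrow1\nrow2\nrow3\nrow4\nrow5\nrow6\n";
        // Second page when the page size is 3: rows 3, 4 and 5.
        System.out.println(readPage(new StringReader(data), 3, 3));
    }
}
```

The skipped lines are still read from the source, but they are never stored, so memory use stays proportional to the page size rather than to the file size.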

  1. In the main method, instead of using the provider directly, create a dataset and pass it a reference to our data provider.
    public static void main(String[] args) {
    	CommaDelimitedProvider<Person> provider;
    
    	provider = ProviderFactory.createProvider();
    
    	CommaDelimitedDataset<Person> ds = new CommaDelimitedDataset<Person>(provider);
    	ds.setMaxRows(10);
    	List<Person> results = ds.getResultList();
    	for (Person p : results) {
    		System.out.println(p);
    	}
    }
    

    We have set the maximum number of rows to 10, so the results returned contain 10 rows of data at most. We can use this mechanism to control the number of results fetched when they are paginated.

  2. The dataset classes implement the Iterable interface, which lets us iterate over the entire set of data. By setting the page size, we control the size of the batches fetched while the data is being iterated over. This lets us control the flow of data, either to reduce the amount of source data held in memory, or to reduce latency with an otherwise slow data source, since the application can process one page of data while the provider is loading the next.
    public static void main(String[] args)  {
    	CommaDelimitedProvider<Person> provider;
    	provider = ProviderFactory.createProvider();
    
    	CommaDelimitedDataset<Person> ds = new CommaDelimitedDataset<Person>(provider);
    	ds.setMaxRows(5);
    	for (Person p : ds) {
    		System.out.println(p);
    	}
    }
    
  3. You can see this process in action by extending the comma delimited data provider as an anonymous class and adding logging to the post fetch method. We do this by modifying the ProviderFactory.createProvider() method:
    public static CommaDelimitedProvider<Person> createProvider() {
    	URL url = ProviderFactory.class.getResource("data.csv");
    	CommaDelimitedProvider<Person> provider = new CommaDelimitedProvider<Person>(url.getFile()) {
    		@Override
    		protected List<Person> doPostFetchResults(List<Person> results,
    				Paginator paginator) {
    			int end = paginator.getFirstResult()+paginator.getMaxRows();
    			System.out.println("Fetching results from "+paginator.getFirstResult()+" to "+end);
    			return results;
    		}
    	};
    	provider.setRowMapper(new PersonRowMapper());
    	return provider;
    }
    
  4. If you run this now with the max rows set to 5, you will see in the log that we iterate through all the records, but only fetch results every 5 rows, since the max rows is set to 5.

    Fetching results from 0 to 5
    Person [id=6825, name=JUANITA LAMBERT, address=139 MANNING HWY,CLYO,76604, email=mbeasley@everyma1l.biz
    Person [id=5740, name=GREG CABRERA, address=736 GENESSEE BLVD,CORDELE,17433, email=cholder@b1zmail.biz
    Person [id=8599, name=ALISSA WISE, address=205 ALICE RD,CAMILLA,14855, email=theyweb@eyec0de.net
    Person [id=9282, name=SHARON WINTERS, address=955 COHEN PIKE,TYRONE,811, email=jlogan@hotma1l.com
    Person [id=2150, name=KRISTY FRANKS, address=1471 ALEXIS PKWY,BALDWIN,85, email=jgates3@somema1l.com
    Fetching results from 5 to 10
    Person [id=9927, name=JEFF RICE, address=104 DUNDEE PKWY,HOGANSVILLE,3741, email=diedlots@b1zmail.org
    Person [id=7972, name=TAMARA BRYANT, address=1382 WOGAN BLVD,CITY OF CALHOUN,43790, email=hotworn@everyma1l.us
    Person [id=5824, name=ALISHA YANG, address=716 HOGANS DR,HARDING,58932, email=foundwrong@hotma1l.net
    Person [id=3402, name=JASON NGUYEN, address=527 MICHAEL CRES,FORT STEWART,14664, email=haveothers@ma1l2u.com
    Person [id=3620, name=LINDSEY CABRERA, address=1420 LAZELERE HTS,FORT STEWART,21650, email=askeddreams@b1zmail.com
    Fetching results from 10 to 15
    Person [id=3511, name=ANTHONY MATHEWS, address=1325 OKEY LN,THUNDERBOLT,63656, email=cfarrell@everyma1l.co.uk
    Person [id=572, name=JARED FORD, address=722 EUCLID RD,FORSYTH,42014, email=ortrying@b1zmail.co.uk
    Person [id=9720, name=AUTUMN WILLIAMS, address=260 HILL PARK,NORCROSS,58355, email=roomwhere@hotma1l.net
    Person [id=3447, name=MARION BROWN, address=739 MIDDLE PATH,MACON,9944, email=hwagner@b1zmail.co.uk
    Person [id=9356, name=HOPE HAYNES, address=1023 COOPERRIDERS CRES,STATENVILLE,34247, email=sacrificeit@eyec0de.com
    Fetching results from 15 to 20
    Person [id=2259, name=JEANNIE RANDOLPH, address=672 EDISON PATH,CENTERVILLE,48580, email=wornto@eyec0de.net
    Person [id=8264, name=RACHAEL CONLEY, address=1223 STEVENS CT,CARROLLTON,40084, email=smokewhite20@eyec0de.net
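
The batch-by-batch behaviour visible in this log can be reproduced with a small iterator that asks for the next page only when the current one is exhausted. This is a sketch of the general pattern under stated assumptions, not DataValve's dataset code: `PagedIterable`, `PageSource`, and `fetch` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical sketch: iterate over all rows while fetching one page at a time.
class PagedIterable<T> implements Iterable<T> {

    // Stand-in for a provider: returns up to maxRows items starting at firstResult.
    interface PageSource<T> {
        List<T> fetch(int firstResult, int maxRows);
    }

    private final PageSource<T> source;
    private final int pageSize;
    final List<Integer> fetchOffsets = new ArrayList<>(); // records every fetch made

    PagedIterable(PageSource<T> source, int pageSize) {
        this.source = source;
        this.pageSize = pageSize;
    }

    @Override
    public Iterator<T> iterator() {
        return new Iterator<T>() {
            private List<T> page = new ArrayList<>();
            private int indexInPage = 0;
            private int nextOffset = 0;
            private boolean exhausted = false;

            @Override
            public boolean hasNext() {
                if (indexInPage >= page.size() && !exhausted) {
                    fetchOffsets.add(nextOffset);
                    page = source.fetch(nextOffset, pageSize);
                    nextOffset += pageSize;
                    indexInPage = 0;
                    // A short page means the source has no more rows.
                    exhausted = page.size() < pageSize;
                }
                return indexInPage < page.size();
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return page.get(indexInPage++);
            }

            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a", "b", "c", "d", "e", "f", "g");
        PagedIterable<String> paged = new PagedIterable<>((firstResult, maxRows) -> {
            int from = Math.min(firstResult, rows.size());
            int to = Math.min(firstResult + maxRows, rows.size());
            System.out.println("Fetching results from " + from + " to " + to);
            return rows.subList(from, to);
        }, 3);
        for (String row : paged) {
            System.out.println(row);
        }
    }
}
```

Only one page is held by the iterator at any time, which is what keeps memory use flat regardless of how many rows the source contains.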
    

Creating a Swing Client

DataValve provides an interface for data access that can be used by different clients. Let's look at using our provider with a Swing JTable.

  1. Start by creating a new class that will be the Swing frame that contains the table and the scroll pane.
    public class CsvTableFrame extends JFrame {
    	
    	private JTable table;
    	private JScrollPane pane;
    
    	public CsvTableFrame(CommaDelimitedProvider<Person> provider) {
    		initControls();
    		initModel(provider);
    	}
    	
            //here we will create the table model and attach the provider
    	private void initModel(CommaDelimitedProvider<Person> provider) {
    	}
    
    	// construct the gui table and scrollable panel
    	private void initControls() {
    		setTitle("CSV Data");
    		setSize(400, 400);
    		setDefaultCloseOperation(EXIT_ON_CLOSE);
    
    		table = new JTable();
    		pane = new JScrollPane(table);
    		table.setAutoResizeMode(JTable.AUTO_RESIZE_OFF);
    		getContentPane().add(pane);
    
    		// show the frame only after its contents have been added
    		setVisible(true);
    	}
    }
    

    This will set up the display for showing a frame with a scrollable table in it.

  2. Now we need to implement the initModel method, which will create a ProviderTableModel and attach the provider passed in. The table model class lets us present our data to the Swing table in a way it understands. The ProviderTableModel declares a method that supplies column values to the model from the data supplied by the provider. The columns are defined in the latter half of the method, and their order determines which value is supplied to the table for each column.
            //here we will create the table model and attach the provider
    	private void initModel(CommaDelimitedProvider<Person> provider) {
    		ProviderTableModel<Person> model = new ProviderTableModel<Person>(
    				provider) {
    
    			@Override
    			protected Object getColumnValue(Person person, int column) {
    				switch (column) {
    				case 0:
    					return person.getId();
    				case 1:
    					return person.getName();
    				case 2:
    					return person.getEmail();
    				case 3:
    					return person.getAddressText();
    				default:
    					throw new RuntimeException(
    							"Unexpected column for person object " + column);
    				}
    			}
    		};
    
    		//add the columns to the model
    		model.addColumn("Id");
    		model.addColumn("Name");
    		model.addColumn("Email");
    		model.addColumn("Address");
    		
    		//assign this model to the table
    		table.setModel(model);
    	}
    
  3. The last piece we need to change is in the main method where we will create our provider and then pass it into the creation of our Swing frame.
    public static void main(String[] args) {
    		
    	CommaDelimitedProvider<Person> provider;
    
    	provider = ProviderFactory.createProvider();
    		
    	new CsvTableFrame(provider);
    
    }
    

    Creating the CsvTableFrame initializes and shows the Swing window.

  4. If we run this now, we will get a Swing Window which contains a table with our data in it.
    [Screenshot: CSV Swing Table DataValve]

The model controls the flow of the data and even includes built-in caching and look-ahead loading so no matter how big your CSV dataset is, there is no long delay while the data is loaded and converted to Java objects. In this case, we pass only the provider to the model and the model is responsible for how much data is fetched in each batch.
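
To give a feel for how a table model can decide how much data to fetch, here is a minimal sketch built on the standard AbstractTableModel. It is not the ProviderTableModel implementation: `LazyTableModelSketch`, `RowSource`, and the batch size are assumptions for illustration, and the sketch leaves out look-ahead loading and cache eviction.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.swing.table.AbstractTableModel;

// Hypothetical sketch of a table model that loads rows in batches on demand.
class LazyTableModelSketch extends AbstractTableModel {

    // Stand-in for a provider that can count rows and fetch a range of them.
    interface RowSource {
        int count();
        List<String[]> fetch(int firstResult, int maxRows);
    }

    private final RowSource source;
    private final int batchSize;
    private final String[] columns;
    private final Map<Integer, List<String[]>> batches = new HashMap<>();
    int fetchCount = 0; // exposed so the effect of batching can be observed

    LazyTableModelSketch(RowSource source, int batchSize, String... columns) {
        this.source = source;
        this.batchSize = batchSize;
        this.columns = columns;
    }

    @Override
    public int getRowCount() {
        return source.count();
    }

    @Override
    public int getColumnCount() {
        return columns.length;
    }

    @Override
    public String getColumnName(int column) {
        return columns[column];
    }

    @Override
    public Object getValueAt(int row, int column) {
        int batchIndex = row / batchSize;
        List<String[]> batch = batches.get(batchIndex);
        if (batch == null) {
            // First request for a row in this batch: load the whole batch once.
            batch = source.fetch(batchIndex * batchSize, batchSize);
            batches.put(batchIndex, batch);
            fetchCount++;
        }
        return batch.get(row % batchSize)[column];
    }

    public static void main(String[] args) {
        RowSource source = new RowSource() {
            public int count() {
                return 25;
            }

            public List<String[]> fetch(int firstResult, int maxRows) {
                List<String[]> rows = new ArrayList<>();
                int end = Math.min(firstResult + maxRows, count());
                for (int i = firstResult; i < end; i++) {
                    rows.add(new String[] {String.valueOf(i), "name" + i});
                }
                return rows;
            }
        };
        LazyTableModelSketch model = new LazyTableModelSketch(source, 10, "Id", "Name");
        System.out.println(model.getValueAt(12, 1)); // triggers a fetch of rows 10-19
        System.out.println(model.getValueAt(15, 0)); // served from the same batch
        System.out.println("fetches: " + model.fetchCount);
    }
}
```

Because a JTable only asks for the cells it is about to paint, a model written this way loads data as the user scrolls instead of all at once.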

If you look in the log as you scroll down the list, you will see that it is loading in the data as you scroll. If you go to the end of the list, and start slowly scrolling back up, you won’t see any more loading messages until you get to the top of the list. This is because the values are cached, but if you have a large dataset, the least recently used items (i.e. those at the top of the table) are removed from the cache and therefore need re-loading when you go back to the start of the list.
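
The eviction behaviour described here is that of a least-recently-used cache. As a rough illustration of the idea, and not a copy of the cache DataValve uses, the standard LinkedHashMap can be turned into one in a few lines; `LruCacheSketch` and its capacity of 3 are assumptions for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a least-recently-used cache; not DataValve's own cache.
class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {

    private final int capacity;

    LruCacheSketch(int capacity) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insert; evict once the cache is over capacity.
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCacheSketch<Integer, String> cache = new LruCacheSketch<>(3);
        for (int row = 0; row < 5; row++) {
            cache.put(row, "row " + row); // scrolling down loads rows 0..4
        }
        // Rows 0 and 1 were evicted, so scrolling back up would reload them.
        System.out.println(cache.keySet()); // prints [2, 3, 4]
    }
}
```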

Creating a JSF Client

The DataValve API allows us to re-use providers with many different clients which we’ll demonstrate with JSF.

  1. Start by creating a new JSF web application, or use the Knappsack Archetypes for Maven to get started quickly. Use the basic archetype so there is no existing application in there or alternative Person objects to clash with.
  2. Add the datavalve-dataset API and the datavalve-faces dependencies to the project:
    		<dependency>
    			<groupId>org.fluttercode.datavalve</groupId>
    			<artifactId>datavalve-dataset</artifactId>
    			<version>0.9.0.CR2</version>
    		</dependency>
    		<dependency>
    			<groupId>org.fluttercode.datavalve</groupId>
    			<artifactId>datavalve-faces</artifactId>
    			<version>0.9.0.CR2</version>
    		</dependency>
    
    
  3. For convenience, copy the ProviderFactory.java, Person.java and PersonRowMapper.java classes over to the new project.
  4. Create a new class called CsvDatasetBean that will be our backing bean. It holds the dataset that the JSF page will work against.
    @Named("csvDataset")
    @RequestScoped
    public class CsvDatasetBean {
    	
    	CommaDelimitedDataset<Person> dataset = 
    	     new CommaDelimitedDataset<Person>(ProviderFactory.createProvider());
    	
    	public CsvDatasetBean() {
    		//initialize the dataset settings  
    		dataset.setMaxRows(10);
    	}
    		
    	public CommaDelimitedDataset<Person> getDataset() {
    		return dataset;
    	}	
    }
    
  5. Now that we have created the backing bean, open home.xhtml and replace the hello world text with the following:
    <?xml version="1.0" encoding="UTF-8"?>
    <ui:composition xmlns="http://www.w3.org/1999/xhtml"
    	xmlns:ui="http://java.sun.com/jsf/facelets"
    	xmlns:f="http://java.sun.com/jsf/core"
    	xmlns:h="http://java.sun.com/jsf/html"
    	xmlns:dv="http://java.sun.com/jsf/composite/datavalve"	
    	template="/WEB-INF/templates/template.xhtml">
    	<ui:define name="content">
    		<h:dataTable value="#{csvDataset.dataset.resultList}" var="v_person">
    			<h:column>
    				<f:facet name="header">ID</f:facet>
    				<h:outputText value="#{v_person.id}" />
    			</h:column>
    
    			<h:column>
    				<f:facet name="header">Name</f:facet>
    				<h:outputText value="#{v_person.name}" />
    			</h:column>
    
    			<h:column>
    				<f:facet name="header">Email</f:facet>
    				<h:outputText value="#{v_person.email}" />
    			</h:column>
    
    		</h:dataTable>
    		<h:form>
    			<dv:simplePaginator paginator="#{csvDataset.dataset}" />
    		</h:form>
    	</ui:define>
    </ui:composition>
    

    This page contains a table that takes the results from #{csvDataset.dataset.resultList} and displays the id, name and email fields. The last item, wrapped in a form, is the datavalve-faces default paginator, which allows you to page through the data. It is provided as part of datavalve-faces, and its namespace is declared at the top of the page. Since JSF 2.0 includes support for AJAX, you can have AJAX-enabled pagination by setting the attributes on the component.

    [Screenshot: CSV JSF Table DataValve]
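
A paging control like the one on this page only needs a small amount of state to work against: where the current page starts, how big a page is, and how many rows exist. The sketch below shows that arithmetic in isolation. It is not the datavalve-faces component or the DataValve Paginator interface; `PagingStateSketch` and its method names are hypothetical.

```java
// Hypothetical sketch of the paging state behind next/previous controls.
class PagingStateSketch {

    private final int maxRows;     // page size
    private final int resultCount; // total number of rows available
    private int firstResult = 0;   // index of the first row on the current page

    PagingStateSketch(int maxRows, int resultCount) {
        if (maxRows <= 0 || resultCount < 0) {
            throw new IllegalArgumentException("maxRows must be > 0 and resultCount >= 0");
        }
        this.maxRows = maxRows;
        this.resultCount = resultCount;
    }

    boolean isNextAvailable() {
        return firstResult + maxRows < resultCount;
    }

    boolean isPreviousAvailable() {
        return firstResult > 0;
    }

    void next() {
        if (isNextAvailable()) {
            firstResult += maxRows;
        }
    }

    void previous() {
        firstResult = Math.max(0, firstResult - maxRows);
    }

    void last() {
        // Start of the final page, which may be only partly filled.
        firstResult = resultCount == 0 ? 0 : ((resultCount - 1) / maxRows) * maxRows;
    }

    int getFirstResult() {
        return firstResult;
    }

    int getPage() {
        return firstResult / maxRows + 1;
    }

    int getPageCount() {
        return (resultCount + maxRows - 1) / maxRows;
    }

    public static void main(String[] args) {
        PagingStateSketch paging = new PagingStateSketch(10, 100);
        paging.next();
        paging.next();
        // prints "Page 3 of 10, first row 20"
        System.out.println("Page " + paging.getPage() + " of " + paging.getPageCount()
                + ", first row " + paging.getFirstResult());
    }
}
```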

Summary

This tutorial has shown how to consume csv files in a way that is re-usable across different client applications using DataValve. If you change the implementation of ProviderFactory.createProvider to return a different type of provider (e.g. JDBC, ORM, Hibernate), then as long as it returns the same kind of Person object, your code will run unchanged in the clients you create. Given the simplicity of the DataValve interfaces, it is not hard to see how easy it is to create providers for other file types, whether they are text or binary based.
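
The reason the clients keep working when the provider changes is ordinary programming to an interface. The following sketch shows the principle with two interchangeable sources behind one hypothetical interface. `SwappableProviderSketch`, `NameProvider`, and the factory methods are invented for this example and do not mirror the DataValve API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: client code written against an interface, not a data source.
class SwappableProviderSketch {

    interface NameProvider {
        List<String> fetch(int firstResult, int maxRows);
    }

    // Source 1: names held in memory.
    static NameProvider inMemory(List<String> names) {
        return (firstResult, maxRows) -> {
            int from = Math.min(firstResult, names.size());
            int to = Math.min(from + maxRows, names.size());
            return new ArrayList<>(names.subList(from, to));
        };
    }

    // Source 2: names parsed from comma delimited text.
    static NameProvider fromCsv(String csv) {
        return inMemory(Arrays.asList(csv.split(",")));
    }

    // The client only sees the interface, so either source can be passed in.
    static String firstPage(NameProvider provider) {
        return String.join(" | ", provider.fetch(0, 2));
    }

    public static void main(String[] args) {
        NameProvider memory = inMemory(Arrays.asList("ANN", "BOB", "CAT"));
        NameProvider csv = fromCsv("ANN,BOB,CAT");
        // Same client call, two different sources, same result.
        System.out.println(firstPage(memory));
        System.out.println(firstPage(csv));
    }
}
```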
