Comparing Mashup Platforms Using JSON & MySQL – Part 2 – Process JSON

30 03 2007

Technologies Used: PHP 5, MySQL 4.1.2, JavaScript, JSON

Software Used: PHP Designer, Aptana, Firebug, Firefox

There are a wide variety of ways to handle JSON that is returned from a server. One of these methods is to provide a callback function pass it to the JSON emiting webservice via REST. When used with static or dynamic script tags, you are no longer restricted to Cross Site Scripting (XSS) limits requireing that the source be on the same server as your webpage. The XSS limit seems particularly ill-suited for mash-ups. The callback function method is what I have choosen for my examples. You will also see a static script tag for these examples. That is to simplify them and make them easier to read. The general preference is to use dynamic script tags rather than static ones.

Actual JSON string returned:

handleJSON({“Observations”:[{"Station":{"number":"1","lat":"29.5762","lon":"-98.7041","total_sp":"9","Species":[{"common":"Mourning Dove","number":"1","code":"MODO","scientific":"Zenaida macroura"},{"common":"Eastern Phoebe","number":"1","code":"EAPH","scientific":"Sayornis phoebe"},{"common":"Carolina Chickadee","number":"1","code":"CACH","scientific":"Poecile carolinensis"},{"common":"Black-crested Titmouse","number":"3","code":"BCTI","scientific":"Baeolophus atricristatus"},{"common":"Carolina Wren","number":"1","code":"CARW","scientific":"Thryothorus ludovicianus"},{"common":"Bewick's Wren","number":"2","code":"BEWR","scientific":"Thryomanes bewickii"},{"common":"Rufous-crowned Sparrow","number":"1","code":"RCSP","scientific":"Aimophila ruficeps"},{"common":"Northern Cardinal","number":"3","code":"NOCA","scientific":"Cardinalis cardinalis"},{"common":"Brown-headed Cowbird","number":"1","code":"BHCO","scientific":"Molothrus ater"}]}},{“Station”:{“number”:”2″,”lat”:”29.574″,”lon”:”-98.7036″,”total_sp”:”5″,”Species”:[{"common":"Black-crested Titmouse","number":"1","code":"BCTI","scientific":"Baeolophus atricristatus"},{"common":"Carolina Wren","number":"1","code":"CARW","scientific":"Thryothorus ludovicianus"},{"common":"Ruby-crowned Kinglet","number":"1","code":"RCKI","scientific":"Regulus calendula"},{"common":"Northern Cardinal","number":"1","code":"NOCA","scientific":"Cardinalis cardinalis"},{"common":"Brown-headed Cowbird","number":"1","code":"BHCO","scientific":"Molothrus ater"}]}}]});

It is not very human readable, but it is highly machine readable.

Below is an extremely simple example of handling JSON and doing something with it.

<html>
<head>
<meta http-equiv=“Content-Type” content=“text/html; charset=iso-8859-1″ />
<title>Simple JSON Handling</title>

<script type=”text/javascript” charset=”utf-8″>
function showCoords(jsd){
var lat = jsd.Observations[0].Station.lat;
var lon = jsd.Observations[0].Station.lon;
var msg = ‘Lat:’ + lat + ‘, Lon:’ + lon;
alert(msg);
}
</script>
</head>
<body>
<script src=http://www.plateauwildlife.com/bbc-mgmt/getstations.php?action=getdata&cid=2&year=2006&func=showCoords type=”text/javascript” charset=”utf-8″></script>
</body>
</html>


See It In Action

The above example is not particulalry usefull for anything other than demostration purposes. We want to actually DO something with our JSON to move us closer to creating the actual mashup. The number thing which I intially strugled with when using JSON & callback functions was that you MUST define the callback function BEFORE your dynamic or static script tag.

Google Maps, Google Earth, Yahoo Maps, and Virtual Earth all take HTML for the contents of the info window when you rollover or click on a point.

ArcWeb Explorer (AWX), however doesn’t take HTML as info window content. AWX does take styled text, videos, picture, audio, & swf for info window content. To this end, if you want to embed rich non-HTML content, AWX allows for some extremelly interesting content to be blended together and presented with great ease. The documentation for text styling is lacking, so creating simple content is actually more dificult in this platform than the others.

I’ve created a javascript file which we can reference in any of the HTML docs that actually embed the mashup.

This file contains

  1. The main callback function
  2. A function which builds an array of HTML tables containing the formated results from each station
  3. A function which builds an array of jscript strings containg the weakly formated results from each station

See The HTML Builder In Action

    //A global variable to assign the parsed JSON to

var jsobj;

 

//Main Callback handler.

//simple assignment to a global variable allows me to reuse and pass

//around the object without any server trips

function handleJSON(reply){

    jsobj=reply;

}

 

    function buildHTML(Observations){

    var info_win = new Array();

    // Build a table element with Station number & total species observed

    for(var i=0;i<Observations.length;i++){

        str = ‘<table border=”1″><tbody><tr class=”station”>’;

        str += ‘<td colspan=”2″>Station ‘+ Observations[i].Station.number + ‘</td></tr>’;

        str += ‘<tr class=”sta_total”><td colspan=”2″>Total Species – ‘ + Observations[i].Station.total_sp + ‘</td></tr>’;

       str += ‘<tr class=”obs_header”><td>Species</td><td>Number</td></tr>’;

        var details = “”;

        var arr = new Array();

        // assign each Species array to a local variable to reduce typing & increase readibility

        //build an string of <tr> elements containing the details of species observed

        arr = Observations[i].Station.Species;

            for (var y=0;y<arr.length;y++){

            details+=‘<tr class=”obs_detail”><td><a href=”http://www.google.com/search?q=%22′;

            details += arr[y].scientific.replace(/\s/,“+”);

            details += ‘%22″>’ + arr[y].common + ‘ (‘ + arr[y].code + ‘)</a></td>’;

            details += ‘<td>’ + arr[y].number + ‘</td></tr>’;

            }

        str +=    details;

        str += ‘</table></tr></tbody></table>’;

        //add table element to array of table element html strings

        info_win[i]=str;

    }

return info_win;

}

 

function buildAWXtxt(Observations){

        var info_win = new Array();

    // Build a formated text list for each Station number & total species observed

    for(var i=0;i<Observations.length;i++){

        str = ‘Station ‘ + Observations[i].Station.number + ‘\n’;

        str += ‘Total Species – ‘ + Observations[i].Station.total_sp + ‘\n’;

       str += ‘Species          Number’;

        var details = new Array();

        // assign each Species array to a local variable to reduce typing & increase readibility

        //build an array formatted text data elements containing the details of species observed

        //using this convention, we can assign a url property to each species line

        //through .data{elements[]} in the properties for each marker

        arr = Observations[i].Station.Species;

            for (var y=0;y<arr.length;y++){

            details[y] = arr[y].common + ‘ (‘ + arr[y].code + ‘) – ‘ + arr[y].number + ‘\n’;

            }

        var obs_info = new Array([str,details]);

        //

        //add table element to array of table element html strings

        info_win[i]=obs_info;

    }

return info_win;

}

 

Previous Parts

1. Emit JSON

 

Next Parts

3. Arcweb Explorer Mashup

4. Yahoo Maps Mashup

5. Virtual Earth Mashup

6. Google Maps Mashup





JetDbConnect – a class providing a simplified interface to JET OleDb compatible datasources using ADO.Net

26 03 2007

I needed a way to run SQL queries on structured text files & Excel files.

Download links available at end of post

For the last few months, I have been toiling away in the highly arcane world of fixed width text files. Very large ones at that (250-500mb). These come from text file data dumps of appraisal district databases. I need only a subset of these records. From there they go into Excel where they are cleaned up on a nearly line by line basis that can’t be handled consistently by RegEx since data is recorded so differently by each county. From there I want to summarize that data on a single household basis. I could go in Excel & do:

  1. Advanced filter, Unique Values (based on a unique household identifier)
  2. Use vlookup & sumif functions to copy and/or sum the data for each column

However, this is rather time consuming since different data is provided by in different order and level of detail for each county. Also, for rolls with more than 25k records and over 100 fields these Excel formulas can take ~30min to complete. You then need to copy & paste values or you’ll have to wait a long time anytime a recalc is performed. Needed a programmatic way to deal with this data and to minimize the human time to just records which required actual human eyes to modify.

Admittedly, creating an OleDb connection, command, reader & even datatable using ADO.Net methods is a pretty simple thing.

Dim connect As New OleDb.OleDbConnection
Dim strConnect As String = “Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\MyExcel.xls;Extended Properties=”"Excel 8.0;HDR=Yes;IMEX=1″”"
connect.ConnectionString = strConnect
Dim cmd As New OleDb.OleDbCommand(“SELECT * FROM [Sheet1$] WHERE City=’Austin’ AND LandValue > 30000″, connect)
Dim reader As OleDb.OleDbDataReader = cmd.ExecuteReader()
Dim dt As New DataTable
dt.Load(reader)

But I wanted something even simpler & significantly more flexible than that. I wanted an object that could handle any potential Jet datasource, field list, target table, and SQL Select statement. I wanted something that could be reused in a variety of different applications and not require any hard coding of these parameters or make me implement different ways to do this in each application. I needed a good object oriented design with high reusability. To this end I created the JetDbConnect.dll

Structure

Properties

  • Closed – OleDbConnection closed or open
  • ConnectionString (ReadOnly)
  • DataSource – the file path to your Jet compatible datasource. The ONLY property you must set either by assignment or through instantiation.
  • DbConnection (ReadOnly)
  • ExtendedProperties – if you need to use a Jet datasource that is not directly supported, here is where you would give the connection string parameters that would allow you to use it
  • StorageMethod – one of the 4 enums of the file types this handles directly (Text, Excel, Access, Dbase). Should never need to be set unless you have a really messed up filename.
  • TablesList – List of Strings of available tables in the datasource
  • TargetTable – The table from which to query data, only needed in Excel & Access. If not given for those then it picks 1st table on table list
  • TextFormat – enum: FixedWidth or Delimited. If not set assumes delimited with default delimiter set in your registry settings

Methods

  • New
    • () – Default
    • (DataSource as string, TextFormat Enum)
      • This is used ONLY with text-based datasources
      • DataSource = path to datasource
      • TextFormat Enum is one of the 2 members of this Enum
    • (DataSource as string, [TargetTable] as string)
      • This can be used with any datasource type.
      • If you use it with a text-based datasource, and it is not delimited with your default delimiter, then you will also have to set the TextFormat enum elsewhere BEFORE using one of the Command,Reader, or Records methods
  • Close() – manually close the connection. Not required but can be useful
  • CommandGetAll, ReaderGetAll, RecordsGetAll ([DataSource] as string, [TargetTable] as string)
    • All 3 methods take the same parameters, an optional datasource path and an optional target table name
    • These methods allow you to easily retrieve an OleDbCommand, OleDbReader, or DataTable containing all the records in your target table of your datasource
    • If you don’t set DataSource here and it has not been set elsewhere, this method will fail
    • If you don’t set TargetTable here and it is an Access or Excel datasource and TargetTable is not set elsewhere, then “Sheet1″ or the first member of TablesList will be used for Excel or Access datasources respectively
  • CommandGetSql, ReaderGetSql, RecordsGetSql (sqlWhere as string, [fields] as string, [DataSource] as string, [TargetTable] as string)
    • All 3 methods take the same parameters:
      • Required: sqlWhere string, ex. “zipcode=’78258′ AND land_value>10000″
      • Optional: fields. A comma delimited list of fields to return with the query. ex. “owner_name, zipcode, land_value”
      • Optional: DataSource & TargetTable – see above for how the default values are handled for these 2 parameters
    • These methods allow you to easily retrieve an OleDbCommand, OleDbReader, or DataTable containing just the records in your target table of your datasource which meet your SQL criteria and optionally to return only the specified fields
    • If you want ALL the fields, pass Nothing or Void for fields. An empty string will cause an error.
  • SchemaTable
    • Returns a Schema DataTable for the DataSource not a TargetTable.
    • If you want a Schema DataTable for a table then get it from an OleDbReader on that table
  • FromXml, ToXml
    • Provides methods for serializing & de-serializing the object.
    • Since only DataSource is required for text & Dbase files and DataSource & TargetTable for Access and Excel, I don’t see these methods as being particularly useful. However, I always like to implement them just in case I need to persist an object or compare an object to a previous state.

Usage

Examples:

Get an OleDbReader for all records in an Excel file from a named sheet

dim jetdb as new JetDbConnect(“C:\myfiles\ExcelFile.xls”, “Hays County”)

dim reader as OleDbReader = jetdb.ReaderGetAll()

-or-

dim jetdb as new JetDbConnect

dim reader as OleDbReader = jetdb.ReaderGetAll(“C:\myfiles\ExcelFile.xls”, “Hays County”)

note: the class automatically adds the required decorations so that Hays County becomes [Hays County$] as required for SQL statements on an Excel datasource

Get a DataTable with select records & fields from delimited text file

dim jetdb as new JetDbConnect(“C:\myfiles\some text file.txt”)

dim dt as DataTable = jetdb.RecordsGetSql(“zipcode=’78258′ AND land_value>10000″,”owner_name, zipcode, land_value”)

note: I did not need to ever specify the TargetTable attribute since it is automatically set to “some text file.txt” and the Data Source is actually set to “C:\myfiles”

Get a list of available Sheets and Ranges from an Excel file without the need for any Office libaries or applications

dim jetdb as new JetDbConnect(“C:\myfiles\ExcelFile.xls”)

dim tables as List(of string) = jetdb.TablesList

Downloads

Just the DLL

The Visual Studio 2005 project