Watch and Learn
Watch our “how to” videos to accelerate your skills with the JSON connector
1. Consuming JSON data from a URL source
2. Consuming JSON data from a FTP folder source (Streaming or Static)
3. Consuming JSON data from a static REST Source
4. Working with JSON data that has separate Lat/Long keys
5. Setting up your JSON data source – JSON Formatters
6. Consuming SODA with the Yellowfin JSON connector
SODA Search Engine https://dev.socrata.com/data/
Here are some JSON dataset links to get you started
1. (URL) US Geological Survey - Earthquakes https://earthquake.usgs.gov/earthquakes/feed/v1.0/...
2. (REST) New York Prison Inmate intake (March 2017) https://data.cityofnewyork.us/resource/gqrb-77i6.j...
3. (REST) Location of Blueberry Farms in the US state of Connecticut https://data.ct.gov/resource/y6p2-px98.json
4. (URL) NASA Meteorite landings https://data.nasa.gov/view/ak9y-cwf9
JSON Data Sources
The connector can access JSON-formatted data from any available URL. A good example is the US Geological Survey earthquake data https://earthquake.usgs.gov/earthquakes/feed/v1.0/...
The data may be static or, as with the earthquake data, the URL may stay the same while the content is updated every 15 minutes. This allows the connector to provide constantly fresh data views.
For example, if a user maps the location of earthquakes above magnitude 4.5 in the last 7 days, then each day they log into their dashboard they will notice new incidents without having to alter the parameters of the report.
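The magnitude filter described above can be sketched in a few lines. This is a minimal illustration using a hand-written sample that mirrors the structure of the USGS GeoJSON summary feed; the records themselves are made up, not real earthquake data.

```python
import json

# Sample feed in the shape of a USGS GeoJSON summary feed (illustrative data).
feed = json.loads("""
{"features": [
  {"properties": {"mag": 5.1, "place": "sample region A"}},
  {"properties": {"mag": 3.2, "place": "sample region B"}},
  {"properties": {"mag": 4.8, "place": "sample region C"}}
]}
""")

# Keep only quakes above magnitude 4.5, as in the dashboard example.
strong = [f["properties"]["place"]
          for f in feed["features"]
          if f["properties"]["mag"] > 4.5]
print(strong)  # → ['sample region A', 'sample region C']
```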
FTP is standard plumbing everywhere; it is robust and well defined, but it is slow. For each file read, the FTP protocol opens and closes a TCP/IP connection, so it is faster to process a single large file than thousands of small files. For near-real-time or after-hours processing, however, it is reliable.
The connector can connect to a plain FTP source or to an SFTP (secure FTP) source.
The speed of your FTP server and the network latency will determine processing performance.
The connector supports file streaming. That is, it can continually watch the folder and process new files as they arrive, using either a DELETION or a TIMESTAMP strategy.
In DELETE mode, the connector reads a file, processes the JSON contents, stores the contents, and then deletes the original file from the source folder. DELETE mode is the clearest way to judge progress, as the only files left in the source folder are those yet to be processed. This mode is best for internal company data, where the source files are a mirror copy of a source repository.
In TIMESTAMP mode, the connector keeps track of the files it has processed from the source folder. It processes the files in FIFO order and does not alter or delete the source files. TIMESTAMP mode is recommended when the source files must not be altered by the connector or any other third-party application.
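The DELETE strategy can be sketched as follows. This is not the connector's implementation, just a minimal illustration of the read-store-delete cycle on a local folder of JSON files.

```python
import json
from pathlib import Path

def process_delete_mode(folder: Path) -> list:
    """Read each JSON file, keep its contents, then delete the source file.
    Files remaining in the folder are exactly those not yet processed."""
    records = []
    for path in sorted(folder.glob("*.json")):
        records.extend(json.loads(path.read_text()))
        path.unlink()  # DELETE strategy: remove the file once stored
    return records
```

After a run, the source folder contains only unprocessed files, which is why DELETE mode is the easiest to monitor.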
We have provided a LIMITER function so that users can set the quantum of work for each scheduled run of the report. If you have 1000 files to be processed, you might set the limiter to 100 and increase the frequency at which the report is run. Each time the report runs, it will process only the next 100 waiting files, updating the report incrementally. This strategy works well for file streaming, or when there is a large static number of files to process, while still delivering some report insights.
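The LIMITER combined with the TIMESTAMP strategy amounts to taking a FIFO slice of the unprocessed files on each run. A sketch, with the already-processed set tracked by filename (an assumption for illustration; the connector's own bookkeeping is internal):

```python
from pathlib import Path

def next_batch(folder: Path, processed: set, limit: int = 100) -> list:
    """TIMESTAMP-style sketch: pick at most `limit` unprocessed files in
    FIFO (oldest-first) order, without altering or deleting the sources."""
    pending = [p for p in sorted(folder.glob("*.json"),
                                 key=lambda p: p.stat().st_mtime)
               if p.name not in processed]
    batch = pending[:limit]          # LIMITER: cap the work for this run
    processed.update(p.name for p in batch)
    return batch
```

With 1000 pending files and `limit=100`, ten scheduled runs clear the backlog while each run stays cheap.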
The connector can process data from a REST service.
The current version of the connector does not support data storage or pagination; this will be delivered in a future version. The difficulty with pagination is that there is no universal approach: each service often handles it differently.
In this version, the connector can continually connect to and process JSON-formatted data from a REST service, but it is limited to the data set it can access in a single call.
For very large JSON record sets this will have a performance impact, so we recommend that users apply filters in the parameters to select only the data topics of interest.
For example, with SODA it is possible to select a dataset of all crime in New York from 2001 to 2017, which would be a very large dataset. If the user is only interested in “Car theft” for the years 2010 to 2017, we recommend applying date and crime-type filters on the REST call to reduce the dataset returned and ensure good performance.
All SODA sources offer a comprehensive set of filters, and other REST services generally have limit clauses to reduce the amount of data selected.
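Building such a filtered call is just URL construction. The sketch below uses SODA's `$where` and `$limit` query parameters against one of the example endpoints listed earlier; the column name `cmplnt_fr_dt` is an assumption for illustration, so check the dataset's own field names before using it.

```python
from urllib.parse import urlencode

# One of the example SODA endpoints from the list above.
base = "https://data.cityofnewyork.us/resource/gqrb-77i6.json"

# SODA filter parameters: restrict the date range and cap the row count.
# (Column name cmplnt_fr_dt is hypothetical for this dataset.)
params = {
    "$where": "cmplnt_fr_dt between '2010-01-01' and '2017-12-31'",
    "$limit": 5000,
}
url = base + "?" + urlencode(params)
print(url)
```

The connector then only has to handle the filtered result set in its single call, rather than the full dataset.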
JSON Data Types
The JSON designer available in the connector is a service that allows users to perform two main functions:
1. PICK only the data elements that they need
2. SET the correct type on the data
The designer will parse the complete (presented) JSON record set and provide the list of available keys. The user may manually select the keys to include from the list. Alternatively, the user may select all keys and then delete the keys not required.
The designer offers key options and can manage nested JSON record sets. When a user selects all keys, the designer will automatically pre-populate the column names with a CamelCase key path. This may be altered by overtyping the column name. The designer can manage numeric values in native JSON numeric form, or numeric values that are wrapped as text. Users may switch a numeric value between INTEGER and NUMERIC types.
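The CamelCase key-path idea can be illustrated with a small flattening routine. This is a sketch in the spirit of the designer's pre-populated column names, not the connector's actual algorithm.

```python
def camel_paths(obj, prefix=""):
    """Flatten a nested JSON object into CamelCase key-path column names."""
    cols = {}
    for key, value in obj.items():
        name = prefix + key[:1].upper() + key[1:]
        if isinstance(value, dict):
            cols.update(camel_paths(value, name))  # descend into nested records
        else:
            cols[name] = value
    return cols

record = {"properties": {"mag": 5.1, "place": "somewhere"}, "id": "abc"}
print(camel_paths(record))
# → {'PropertiesMag': 5.1, 'PropertiesPlace': 'somewhere', 'Id': 'abc'}
```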
The designer can manage several date formats common in business and scientific JSON records. It allows the user to flag a value as a DATE and then provide the exact format from a list of common date/time formats. If a common format is not available, the user may build the format with standard regex values.
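Trying a value against a list of common date/time formats, as the designer's format list does, can be sketched like this. The format list here is a small illustrative sample, not the connector's actual list.

```python
from datetime import datetime

# A few formats common in business and scientific JSON records (illustrative).
COMMON_FORMATS = [
    "%Y-%m-%dT%H:%M:%S",   # ISO-8601 without time zone
    "%Y-%m-%d",
    "%d/%m/%Y",
]

def parse_date(text):
    """Return the first successful parse against the common formats, else None."""
    for fmt in COMMON_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None

print(parse_date("2017-03-15"))  # → 2017-03-15 00:00:00
```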
The designer can manage several coordinate formats common in business and scientific JSON records. A key ability of the designer and connector is identifying and formatting POINT values (GeoJSON), or combining separate latitude/longitude values into a POINT. For POINT values, the user must convert to a WKT GIS format inside Yellowfin to realize the POINT for mapping charts or Google Maps.
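Combining separate latitude/longitude keys into a WKT POINT is a one-line transformation, sketched below. Note that WKT orders the coordinates as longitude then latitude, the reverse of the usual lat/long convention.

```python
def to_wkt_point(lat: float, lon: float) -> str:
    """Combine separate latitude/longitude values into a WKT POINT string.
    WKT coordinate order is (longitude latitude)."""
    return f"POINT ({lon} {lat})"

# e.g. a location in Connecticut (illustrative values)
print(to_wkt_point(41.6, -72.7))  # → POINT (-72.7 41.6)
```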
For more details, review the attached PDF for specific setup instructions and an FAQ section.