Heurist Training – August to October

Our first month of Heurist training is complete and the schedule for the next 3 months (August to October) has been posted. Sign up for one of these courses to join those who have had their Heurist questions answered and are fast becoming confident users, and save yourself hours of working things out on your own.

We also offer one-on-one training for those whose skill set falls in between the courses offered, or who have specific issues that they wish to concentrate on. Email Claire for details.

Dependencies in Heurist

Relationships between records in Heurist build dependencies between these records. Dependencies are an important concept, not only for creating an efficient structure for a Heurist database, but also for managing CSV file imports. The use of the term “dependency” in Heurist is similar to the use of the term in computer science meaning a situation where a statement in a program refers to another statement. We can therefore also use the term “linkages” for dependencies. Relationships between records set up linkages between these records. It is a “dependency”, because in order for each specific link to exist, the records become dependent on each other – obviously, you cannot link records which do not exist. Thus, two records become dependent on one another because of the relationship between them.

Records can be linked in Heurist in a number of different ways. These linkages can also be hierarchical – with one linkage referencing a record, which is itself linked to other records – forming a web of linkages. Heurist contains visualisation tools, such as the Network Diagram (in the Filter-Analyse-Publish screen), which can help us to see the linkages between records in our database. In the screenshot shown above, the records from the Shakespeare play, “Macbeth”, are shown with some of their linkages. Single arrows reflect records linked by Pointer fields and double arrows reflect records linked by Relationship Marker fields.

The Visualise tool in the Manage screen can also help us to understand the linkages formed by the structure of the record types in our database. In the screenshot shown above, some of the record types associated with Shakespearean plays are shown as a web of linkages. You can see that some record types link back to themselves (for example, a Person record can be related to another Person record – “ChildOf” etc. Characters can also be related to each other in similar ways). These visualisation tools help to conceptualise the dependencies that exist within a Heurist database.

Dependencies mainly become important during the process of importing CSV files. During the import process, one is asked to define the primary record type. This is the main record type for which you are importing data. Once one has selected the primary record type, one is asked to define any dependencies that may be present in the data for this import file. Note that the selection of dependencies is specific to each import file. One cannot import dependencies for which one has no data. Similarly, if you fail to select a dependency for which there is data, Heurist will have no way of matching that data.

Importing Dependencies

Let us consider this with a specific example. Let us suppose that we are importing a table of data on Shakespeare’s plays into the database shown in these examples above. We have information in a spreadsheet detailing aspects about each play and we have exported this to a CSV file. Since we are importing into the Play record type, we need to consider the dependencies which may be present related to the Play record type. If we consider the Visualise diagram above, we can see that Play has possible linkages (dependencies in the case of an import) to the Character, Person, Role, Mythological Element and Historic Event record types. If our dataset includes the names of Plays and Characters in the Plays, then we will add Characters as a dependency of Play. In fact, in the example given here, Character is a child record of Play, thus it is critical that we add the Characters as dependencies of Play records, so that we don’t create orphans. For the sake of clarity, and being able to easily correct errors in the original dataset during the import process, we strongly recommend that large datasets are subdivided into smaller CSV files. It is preferable to deal with one dependency (or group of dependencies) at a time. This is also necessary since there may be more than one potential match to Heurist record types in the original dataset.

For example, one might have a column in a spreadsheet that has the names of Persons as the Playwright, and another that has the names of Persons as Actors playing Characters. If these are imported at the same time, Heurist will not be able to differentiate which Persons should populate the Playwright field and which should populate the Actor field. It is therefore best to split this into (at least) 2 distinct files – one of which populates the Playwright field and another of which populates the Actor field. Below is an example of data to populate the Play – Playwright section. Note that here the Person record (which will populate the Playwright field) is a dependency of the Play record. We can make it even more complicated by having data on Place of Birth. The Place of Birth field points to a Place record, and thus we have a hierarchy of dependencies: Place is a dependency of Person, which in turn is a dependency of Play. Whilst none of this data is a Required field, in order to populate the Person and Place pointer fields, it is required that we import data for those fields – thus making the fields into dependencies for this particular import. Note that in order to record multiple plays by the same playwright, the information for playwright is repeated in each row (see CSV data in the background of the screenshot).
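The split into separate import files can be sketched in Python. This is only an illustration, not part of Heurist: the helper `split_columns`, the column names and the sample rows (including the placeholder actor names) are all assumptions made up for the example.

```python
import csv
import io

def split_columns(csv_text, columns):
    """Return CSV text holding only the named columns, with duplicate rows dropped."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    seen = set()
    for row in reader:
        key = tuple(row[c] for c in columns)
        if key in seen:
            continue  # the same play/playwright pair appears once per actor row
        seen.add(key)
        writer.writerow({c: row[c] for c in columns})
    return out.getvalue()

# Hypothetical combined export in which Person names appear in two different roles.
combined = (
    "Play Title,Playwright Family Name,Playwright Given Names,Actor,Character\n"
    "Macbeth,Shakespeare,William,Actor A,Macbeth\n"
    "Macbeth,Shakespeare,William,Actor B,Lady Macbeth\n"
    "Hamlet,Shakespeare,William,Actor C,Hamlet\n"
)

# File 1: Play with Playwright (primary record type Play).
playwright_csv = split_columns(
    combined, ["Play Title", "Playwright Family Name", "Playwright Given Names"]
)
# File 2: Actors and Characters, imported separately.
actor_csv = split_columns(combined, ["Play Title", "Actor", "Character"])
```

Each resulting file contains only one kind of Person, so there is no ambiguity about which pointer field those names should populate.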

In the example shown in the screenshot above, you can see both the contents of the CSV file in the background on the left and the dependencies selected in the popup. Our Primary record type in this example is Play, because we want to add new Play records and populate the “Playwright” pointer field in the Play records. Note that Playwright pointing to Person is the first listed dependency under the Play record type. On the right-hand side of the popup you can see that the pointer Playwright to Person has a Heurist ID, which is written in a greenish yellow font: Playwright H-ID. Looking down the list of dependencies, we can see that the Playwright to Person pointer is listed as the fourth dependency (after Characters, Historic Events and Place). Therefore we check the box next to the Playwright to Person dependency. Playwright to Person itself has further dependencies (pointers or linked records). The dependencies relate to fields in the Person record type. These fields are Life Events, Associated Person, Place of birth and Place of death. In our example we have data to populate the Place of birth field in the Person record. Note that the Place of birth field is a pointer to a Place record. It has its own Heurist ID (Place(s) H-ID). Since we wish to populate this field, we need to select the dependency to do so. Thus we look further down the list to find the Place(s) pointer to Place and check the box. We choose this dependency rather than the Place pointer to Place listed above Playwright because on examination of the Heurist IDs on the right, we can see that Place (without the (s) added) is a pointer from Historic Event and will record the Place associated with the Historic Event. We do not wish to record any places associated with historic events, so we instead choose the dependency associated with Place of birth, which is Place(s).

Since we do not want to import any other information at this time, we do not check any other dependencies. This is also very important. If we select any other dependencies, we must have data to populate those fields in our current CSV file. We will not be able to proceed if that information is missing. Therefore, we choose only the dependencies in the data in the current CSV file being imported, and no others.
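Before starting an import, it can help to check that the CSV file really contains data for every dependency you intend to select. The sketch below is an assumption-laden illustration rather than anything Heurist provides: the `REQUIRED_COLUMNS` mapping, the dependency labels and the column names are hypothetical.

```python
import csv
import io

# Hypothetical mapping from a selected dependency to the CSV columns it needs.
REQUIRED_COLUMNS = {
    "Play": ["Play Title"],
    "Playwright -> Person": ["Family Name", "Given Names"],
    "Place of birth -> Place": ["Place of Birth"],
    "Characters -> Character": ["Character Name"],
}

def missing_columns(csv_text, selected):
    """Map each selected dependency to the required columns absent from the CSV header."""
    header = set(next(csv.reader(io.StringIO(csv_text))))
    missing = {}
    for dependency in selected:
        absent = [c for c in REQUIRED_COLUMNS[dependency] if c not in header]
        if absent:
            missing[dependency] = absent
    return missing

# Hypothetical file holding Play, Playwright and Place of birth data only.
sample = "Play Title,Family Name,Given Names,Place of Birth\n"

ok = missing_columns(
    sample, ["Play", "Playwright -> Person", "Place of birth -> Place"]
)
too_many = missing_columns(sample, ["Play", "Characters -> Character"])
```

Here `ok` is empty, while `too_many` reports that the Characters dependency has no matching column, which is the situation in which the import cannot proceed.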

Heurist will now start at the most fundamental point in the hierarchy of dependencies to import data. The first dependency to be imported will therefore be the Place(s) records for Place of birth. These records are a dependency of Person which in turn is a dependency of Play, so Place(s) needs to be imported so that Person can be imported, so that Play can be populated. Heurist will import data for each of the dependent records in turn, before creating the linkages between all these records (obviously, it cannot do this the other way round). We will therefore be asked to match and then import our Place of birth records first (see screenshot above). Secondly, we will be asked to match and import our Person records to populate the Playwright fields. We will import both Family Name and Given Names for the Person records. In this case, William Shakespeare with a birthplace of Stratford-upon-Avon is already a record in the database, so Heurist will match the record and proceed immediately to Play without any import being necessary. Otherwise we would import the fields for Person, including the pointer field to Place of birth. We now match our Play records to the Title of Play field. In this example we have 3 new Play records which will be added to the database.
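The "most fundamental first" ordering described here is what computer science calls a topological ordering of a dependency graph. The sketch below uses Python's standard `graphlib` module (Python 3.9+) to illustrate the idea. It is not Heurist's own code, and the `depends_on` dictionary is simply our hand-written summary of the example.

```python
from graphlib import TopologicalSorter

# Each record type maps to the record types it points to,
# which therefore have to exist before it can be linked.
depends_on = {
    "Play": {"Person"},    # Playwright pointer
    "Person": {"Place"},   # Place of birth pointer
    "Place": set(),        # no further dependencies selected
}

# static_order() yields every record type after all of its dependencies.
import_order = list(TopologicalSorter(depends_on).static_order())
print(import_order)  # ['Place', 'Person', 'Play']
```

Because this example is a simple chain, there is only one valid order. With several independent dependencies, more than one order would satisfy the same constraints.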

Once we have matched the Play records, we now need to populate the fields of the Play record type and this includes our pointer to Playwright (see above). This will create the links between the Person record and the Play record, through the Playwright field, as well as populating 3 new Plays into the database (all with the same Playwright, in this case). The process would be the same if we were dealing with different Plays by different Playwrights, since it is matching the previously imported “Playwright H-ID” records to the Playwright record pointer field. Once we have run the Insert/Update, we will have 3 new Play records in our database, each of which points to the Person record, William Shakespeare, as the Playwright (see below).

Now, let us presuppose that our dataset also includes People who have acted in certain Roles as particular Characters in particular Plays. We don’t import that data at the same time as the Playwright data, otherwise Heurist can’t distinguish which Person records are Actors and which Person records are Playwrights. Therefore, it is best to create a second CSV file, which contains the Actor and Roles information. This time our Primary record type will be Role, not Play. There is no pointer from Play to Actors and Roles, rather the Role record type points back to Play, as well as to Persons as Actors, Directors etc and to the Characters played (where appropriate).

Note that when we choose Role as the primary record type, Person is a Required field and therefore the dependency Person below is automatically checked. We will also check the dependencies for Play, Characters and Place(s) for Place of Birth. In this example, Play is the most fundamental record type and therefore it gets imported first. Heurist builds the order for import as first Play, then Place(s), then Person, then Character, then Role (see screenshot below).

Once Place(s) is defined, we can subsequently populate the record pointer for Place of birth in the Person record. Once Play, Person and Character are defined we can then populate the relevant fields in Role. Thus each of these records is a dependency of the records which point to it and needs to be imported or matched before the final linkages are made. The results of the import process are displayed in the screenshot below. Compare this to the CSV data which was imported, which can be seen in the background of the image of the checked dependencies shown higher up this page. In this screenshot we have opened each of the individual records (highlighted in blue in the Record View screen) – viz. Play, Person (with Place of birth – Waterford, Ireland, in this example) and Character. The web of linked records is clearly indicated by the blue highlighted fields in each of these records – for example, there are links to all the people who have played the role of Macbeth, from the Character Macbeth; there are links to all the Characters from the Play, Macbeth. The advantages of these linked records for facilitating analysis and management are readily apparent.

Whilst Heurist can easily manage the systematic import of complex dependencies, we strongly advise that you simplify this process by subdividing your data into smaller files for import. As can be seen by the example above, the complexity and hierarchy of dependencies can grow quickly and thus, to ensure that the process achieves the results desired, it is best to ensure that you understand the structure of the data being imported.

Creation of Linkages

It is also critical that the import process is completed. If the import is aborted halfway through, whichever records have been imported to that point will exist in the database. However, the final linkages to the primary record type will not have been made, as this happens in the final step of the import process, once the records for linking are complete. This can lead to records which are meant to be linked to other records, but instead float in isolation. If you need to abort the import process in order to fix up something in either the CSV file or the structure of the record types, do ensure that you return to the import to complete the process (the import wizard will return you to the point at which you aborted the process).
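The idea of records that "float in isolation" can be illustrated with a few lines of Python. This is a conceptual sketch only: the record IDs and the list of links are invented, and Heurist does not expose its data in this form.

```python
def unlinked_records(record_ids, links):
    """Return the IDs that appear in no link, as neither source nor target."""
    linked = set()
    for source, target in links:
        linked.add(source)
        linked.add(target)
    return sorted(r for r in record_ids if r not in linked)

# Hypothetical state after an aborted import: the dependency records exist,
# but only one of the Person records has been linked to anything.
records = ["place-1", "person-1", "person-2"]
links = [("person-1", "place-1")]

isolated = unlinked_records(records, links)
print(isolated)  # ['person-2']
```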

Training for Heurist!

ArcheFact has become an authorised provider of support for Heurist! Heurist is a knowledge management system designed especially for the Humanities, started as a project at the University of Sydney in 2005. It is now widely used by Humanities researchers in a range of disciplines around the world. Heurist is built on MySQL, an open source relational database management system. Heurist’s code is also open source and available on GitHub. There are instructions on the Heurist Network website for those wishing to install a separate instance (expertise in server installations using Linux, Apache, PHP and MySQL is strongly recommended; alternatively, ArcheFact can advise and assist in the setup of such instances. Contact us for details).

Heurist is an excellent system for managing data in any area of the Humanities, especially Cultural Heritage. Designed by Dr Ian Johnson, an archaeologist, Heurist excels in managing temporal and spatial data in particular. Using an interface similar to Google Maps, but enhanced by a timeline populated by date fields in the database, spatial and temporal records are displayed in a visually pleasing manner (see screenshot above). Not only can one digitise directly into the map interface, but one can also import KML files or coordinates in decimal degrees (longitude and latitude) and UTM. Temporal records are even more diverse, allowing for approximate dates, date ranges (as shown on the timeline above), radiocarbon dates (with standard deviations), use of 9 different calendars (including Julian, Islamic, Mayan and Hebrew calendars), as well as actual times (with time zones).

Another advantage of Heurist is the ability to quickly and easily access linked records. There are a variety of ways to link records together and even an option to include vocabularies to describe the relationship (e.g. “IsSonOf” or “IsStratigraphicallyBelow” or even “MayBeTheSameAs”). Linked records are displayed in blue text, which can be clicked to open the linked record in a popup window (see screenshot above). Linked records can include images, scanned documents, other records, etc. All linked records refer to each other. Thus if one links records to a bibliographic source, for example (Heurist can sync with a Zotero bibliography), then viewing the bibliographic source will also show all other records linked to that source – an invaluable tool for research! (see screenshot below)

This is just a small taste of the amazing power of Heurist, which can completely transform your research! So explore Heurist today and then sign up for one of our online training courses to get you off to a flying start! Contact us for more details.

Quantum GIS

Quantum GIS (QGIS) is open source GIS software, available for free download. It runs on Windows, Mac, Linux and Android. QGIS was started in 2002 and has been supported by the Open Source Geospatial Foundation (OSGeo) since 2007. OSGeo is a very active global community of people and organisations involved in the geospatial industry and with their support QGIS rapidly grew into a powerful GIS. It is now available as version 2.6 and is becoming a true disruptor to the GIS market. As with other open source projects, one of the main advantages is that the contributors work in the industry and thus there are many little tweaks that exactly match a user’s needs. If you find yourself thinking “hmm, I’d like to …” chances are someone else has had that exact thought and has already implemented it. QGIS is licensed under the GNU GPL and thus can be modified and openly contributed to, as well as being free to download and use.

QGIS has a similar look and feel to the ArcGIS suite, and is at least as sophisticated in its display and editing functions, if not more so. It has a range of advantages over most other GIS, not least in terms of cost. In usability it is also advantageous. It integrates with other programs such as PostGIS, GRASS, and MapServer and has an extensive list of plugins. The plugins cover a vast range of functionality, everything from automatically grading colours across a map to searching layers and geocoding. One of our personal favourites is the integration with OpenStreetMap and Google Maps layers, providing satellite or street map layers directly within the GIS. One can thus display one’s own GIS layers directly over the OpenStreetMap layer and it will zoom in and out with your own data. For cultural heritage purposes this is extremely useful.

In terms of formats, QGIS imports and exports shapefiles (ArcGIS), kml/kmz (Google Earth), dxf (AutoCAD) and many other formats. A personal fave is that the export to Google Earth function carries across all the selected fields instead of restricting you to just “Name” and “Description” fields. Once in Google Earth, you can see all the associated data that you exported in a neat table for each point or polygon. Of course, as with other GIS, you can import from text and spreadsheet files too and generate layers.

The only part that lags slightly in functionality is the Map Composer section for creating printed maps. However, this is currently a focus of attention from the developers and it is likely to once again overtake the other products in functionality very soon. To be fair, we haven’t created printed maps from QGIS for months, and certainly not in the latest version; particularly with active open source software, there are usually lots of advances in the space of a few months.

Overall, we can’t recommend QGIS highly enough. It provides a stable, functional, powerful GIS, as well as access to a community of other users, can be incorporated into any system and you never need to worry about license payments or upgrades again! This is why we choose to use it for our own work and offer training to anyone else wanting to do the same.

AAA conference in Coffs Harbour

The 36th annual Australian Archaeological Association (AAA) conference was held in Coffs Harbour from 1 to 4 December 2013. Claire drove down and presented a paper on the influence of climate change on changing settlement patterns in the Gulf (which is related to her PhD research).

The conference was extremely interesting and enjoyable. President of AAA, Pat Faulkner, opened the conference and Uncle Mark Flanders did a very moving Welcome to Country on behalf of the Garlambirla Guyuu Girrwaa Elders. Sessions then commenced covering a range of interesting topics, starting with an opening keynote by Doug Comer on “The Strategic Value of Best Practices for Archaeological Heritage Management”.

Doug used his considerable experience with ICOMOS and the field of CRM around the world to discuss pressures on heritage management now and in the future, as well as encourage everyone working with heritage to address these issues in the language of economics and politics, and thus be understood by those responsible for legislation and large projects.

Parallel sessions meant the usual challenges of deciding which papers to attend, but there was a broad coverage of interesting topics, with papers largely grouped in specialities. One highlight was the social media session, with 2 papers presented remotely via Google Hangout by presenters physically in the UK, US and France. The Archaeological Data Management session also used Google Hangout so that Ian Johnson could present on Heurist and Brian Ballsun-Stanton could present on FAIMS (along with Shaun Ross, who was physically present) from Sydney. That session was also streamed live online and is available on YouTube here. We encourage anyone interested in any aspect of archaeological data management to watch it.

Another highlight was the FAIMS workshop, which was also recorded and can be accessed here. Although we struggled with the limitations of a very flaky and weak wifi signal, we managed to download the app and access the Heurist module builder for FAIMS. FAIMS is progressing very well. Adela demonstrated the GIS capabilities outside (so we could access GPS as well) and it was very impressive indeed. We strongly urge everyone to have a look at the wiki, give it a go and contribute to this incredibly worthwhile project! Well done all at FAIMS!

All in all a very worthwhile experience and source of a lot of food for thought. We look forward to next year’s conference already!

Heurist

Heurist is an “academic knowledge management system” developed at the University of Sydney and is the brainchild of Dr Ian Johnson. It draws together Ian’s experience with archaeological data management systems as well as incorporating aspects of his research into best methods of managing time data in archaeology. Ian developed a system called TimeMap, to integrate concepts of time with mapping data. Many of the ideas developed in TimeMap are incorporated into Heurist.

Heurist therefore has the potential to be a good fit for people dealing with science and humanities data, as it has unique capabilities to manage both time and space built in to its architecture. Heurist takes a different approach to the management of data from many other systems. Backed by a powerful relational database (MySQL or MariaDB), the Heurist interface allows one to deal with the data in a way that makes sense to those of us working with science and humanities data. Locational data is displayed on a built-in Google Maps interface and basic editing (creation of points, polygons etc) can be done directly into the database if required. Time data can be entered as calendar dates, approximate dates or radiocarbon dates (with confidence intervals) and timelines are automatically generated for all records with dates.

Records can be linked through a series of pre-defined relationships, which are then indicated on viewing the data. So, for example, a record concerning a site (displayed with its location marked on the map and a timeline of applicable dates) will show pointers to other records to which it is linked (for example photos or reports of the site). Heurist includes a bibliographic suite of record types to manage bibliographic references, which can be linked to any other records in the database. Other record types are available where previously defined in the system by other users, or of course one can add one’s own specific record types.

Users can share access to databases, managed by access controls, which allow differential access to different parts of the database, if required. Features within the database allow users to share information easily with each other (for example messages noting what records have been added etc) and even maintain a blog within the database.

Heurist can import and export a range of formats and is scalable, from as small and simple as you need through to vast and complex datasets, seamlessly and without any need for porting the data from one system to another. Heurist has been successfully used on a range of projects over the past 7 years. It provides the database behind the Dictionary of Sydney project, a 3D imagery project on Gallipoli for the ABC, the ARC-funded Digital Harlem project and another ARC project on the history of Balinese paintings.

A Heurist database was developed for the Ministry of Culture in Bahrain for their World Heritage Nomination of the Pearling Testimony of Bahrain. The project has since been listed as a World Heritage Site.

Heurist is available as open source software (see the code repository on Google Code), for installation on a server. Some technical expertise is required to install it. If you are interested in using Heurist, please contact us and we can advise on the best implementation for your needs. ArcheFact is an active contributor to Heurist (code, fixes and other expertise) and we have our own instance available for clients.

Heurist and FAIMS are now collaborating so that Heurist can be used as a database to manage the data collected with the FAIMS field recording mobile app. This promises to be a very powerful combination, which we look forward to implementing for our clients. The combination generated a lot of interest at the CAA 2013 conference in Perth.

FAIMS mobile app

An exciting update at the CAA conference was the Federated Archaeological Information Management Systems Project (FAIMS) and in particular, their field recording system for mobile devices. FAIMS is a federally funded project led by the University of New South Wales in conjunction with 41 other organisations around Australia and worldwide, which aims to “create a digital infrastructure for archaeology”.

FAIMS has done an extensive study of what’s already available, both in terms of mobile apps for field recording, as well as running GIS on android devices, and has seen what works and what doesn’t. The project will have a working mobile app for android, specifically tailored for Australian archaeology, but flexible enough to be used for almost any related purpose, available before the end of the year. Prototypes are already available.

What they showed us at the conference looks amazing – a fully customisable interface with, in addition to standard data entry (including GPS capture, photos etc from internal or external devices), options for setting how sure one is of a designation, and mini “dictionaries” of e.g. different ceramic styles/types so that field operatives can pick the closest and in effect say “it looks kind of like this” where “kind of” is optionally specifiable as well. FAIMS is committed to integrating GIS on-the-fly and making sure the whole thing works off-line (which is where most fieldwork takes place, at least in Aus).

As well as an American archaeological database system called TDAR, FAIMS is now also working closely with Heurist. Heurist is a brilliant archaeological database, integrating both time and space in new and novel ways, perfect for archaeologists, and allowing one to structure one’s data in a very effective way. The FAIMS/Heurist combo looks to be very powerful for archaeologists and should put us at the leading edge of archaeological information systems worldwide.

Watch this space for news of further developments!

CAA 2013 was awesome!

The CAA conference in Perth was a fantastic time of catching up with old friends and finding out what’s new and what’s hot in the realms of archaeological computer-ing! 😉

There was a great selection of keynote speakers and good info in a lot of the sessions. As usual the hallway track was where most of the fundamental work was done, although several discussion sessions (including the ones on FAIMS and GIS/consulting, which were incredibly useful) also provided a good overview of where things are going and what we all think about it.

Catching up with Ian Johnson was awesome. His Heurist project has come a long way and was released under an Open Source license just before the conference. I had already started installing it the moment it was released, but during the conference I worked on it some more as obviously having Ian there was a great asset. Together we were able to identify and work through most issues, which enabled me to create a working system.

While it might stress out some people to not yet have a slick packaged product that “just works”, I regard it as perfectly normal for a project that has been internal to require some extra work and feedback when it meets the big wide world. Typically this is something (other) companies provide and that’s one of the key things we’re looking at (providing hosted Heurist for clients). It’s not the first open source project I’ve been involved in. I’ve made several code contributions since, which Ian and his team have incorporated. Their responsiveness is a great sign that they understand the Open Source development model and its benefits.

It was really great to see the strides that are taking place with Heurist and FAIMS and their interactions with each other. The use of mobile devices for recording in the field is definitely the new hot idea and it was particularly useful to hear people talk from experience about what works and what doesn’t. More on that later. Of course, as always, the database behind it is absolutely critical and that’s where Heurist comes into its own and leaves everything else way behind. More on that soon too.

Favourite quote from the conference: Eric Kansa’s keynote, “many archaeologists are using sophisticated data models using the wrong tools, such as Excel.” Over 10 years ago I declared “Arjen’s rule#1: A relational database is not a spreadsheet” which still gets quoted today. It’s as relevant as it was then. More on that too, soon.

ArcheFact at CAA 2013 in Perth

Claire and Arjen will be at the Computer Applications & Archaeology (CAA 2013) conference at the University of Western Australia in Perth, 26-28 March 2013. We hope to meet or catch up with you there!

Claire is also chairing a session on the use of GIS in cultural heritage consultancy work.