Tag Archives: OpenSource

Dealing with Factors in R

What is the deal with the data type “Factor” in R?  It has a purpose and I know that a number of packages use this format, however, I often find that (1) my data somehow ends up in the format and (2) it’s not what I want.

My goal for this post: to write down what I’ve learned (this time, again!) before I forget and have to learn it all over again next time (just like all the other times).  If you found this, I hope it’s helpful and that you came here before you started tearing your hair out, yelling at the computer, or banging your head on the desk.

So here we go.  Add your ways to deal with factors in the comments and I’ll update the page as needed.

Avoid Creating Factors

Number 1 best way to deal with factors (when you don’t need them) is to not create them in the first place!  When you import a csv or other similar data, use the option stringsAsFactors = FALSE (or similar… read the docs to find the options for the command you’re using) to make sure your string data isn’t converted automatically to a factor.  R will sometimes also convert what seems to clearly be numerical data to a factor as well (usually because a stray non-numeric entry makes R read the whole column as text), so even if you only have numbers, you may still need this option.

MyData<-read.csv(file="SomeData.csv", header=TRUE, stringsAsFactors = FALSE)
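To see the difference for yourself, here is a small sketch that writes a made-up CSV to a temporary file and reads it both ways.  The file contents and the variable names (csv_path, as_factors, as_strings) are invented for illustration; the "n/a" entry stands in for the kind of stray text that turns a number column into a factor.

```r
# Write a tiny CSV to a temporary file so the example is self-contained.
# The data are made up; "n/a" is a stray text entry in a numeric column.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("site,count", "A,10", "B,n/a", "C,7"), csv_path)

# Read the same file with and without automatic factor conversion.
as_factors <- read.csv(csv_path, header = TRUE, stringsAsFactors = TRUE)
as_strings <- read.csv(csv_path, header = TRUE, stringsAsFactors = FALSE)

# str() shows the type of each column: Factor in the first, chr in the second.
str(as_factors)
str(as_strings)
```

Note that since R 4.0.0 the default for stringsAsFactors is FALSE, so you mostly run into this with older R versions or older scripts.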

Convert Data

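Ok, but what if creating a factor is unavoidable?  You can convert it.  It’s not intuitive so I keep forgetting.  Wrap your factor in as.character() to just get the data.  It’s now in string format, so if you need numbers, wrap all of that in as.numeric().  (Don’t skip the as.character() step: calling as.numeric() directly on a factor gives you the internal integer level codes, not your original values.)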

#Convert from a factor to a character vector
CharacterData<-as.character(MyFactor)

#Convert from a factor to numerical data
NumericalData<-as.numeric(as.character(MyFactor))
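Here is a small sketch of that conversion on a made-up factor (the values and the names codes, values, and values_alt are invented for illustration).  It runs the direct and the two-step conversions side by side so you can compare the results, and it includes an alternative described in the R FAQ that converts the levels once and then indexes by the factor.

```r
# Hypothetical factor holding numbers stored as text labels.
MyFactor <- factor(c("10", "20", "10", "5"))

# Direct conversion returns the internal level codes, not the values.
codes <- as.numeric(MyFactor)

# Going through character first recovers the original numbers.
values <- as.numeric(as.character(MyFactor))

# Alternative: convert the levels once, then index by the factor.
values_alt <- as.numeric(levels(MyFactor))[MyFactor]

print(codes)
print(values)
print(values_alt)
```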

 

What’s Missing?

Do you have any other tricks to working with data that ends up as a Factor?  Let me know in the comments!


Making of a Moon Tree Map


I’m presenting a workflow for finishing maps in Inkscape at FOSS4G North America this year (2016). To really show the process effectively, I made a map and took screenshots along the way.

The Data

I decided to work with Moon Tree location data.  It’s quirky and interesting… and given that this is a geek conference I figured the space reference would be appreciated.  A few months ago I learned about Moon Trees watching an episode of Huell Howser on KVIE Public Television and then visited the one on the California State Capitol grounds.  I later learned from my aunt that my grandfather was a part of the telemetry crew that retrieved the Apollo 14 mission that carried the seeds that would become the Moon Trees, so there’s something of a connection to this idea.  Followers of my research also know that I’m a plant person, particularly plant geography.  So this seemed like the perfect dataset.  I was fortunate to find that Heather Archuletta had already digitized the locations of public trees and made them available in KML format.

Data Processing

The KML format is great for some applications (particularly Google maps, for which it was designed) but it poses some challenges.  I spent several hours… maybe more than I want to admit… formatting the .dbf to make the shapefile more useful for my purposes.  I created columns and standardized the content.  The map does not present all the data available (um… duh.).  It was challenge enough getting all this onto one page.

Yes, Inkscape is Necessary

You can’t make this map in QGIS completely.  I mean, normally you can make some fantastic maps in QGIS, but this one is actually not possible.  Right now, QGIS can’t handle having map frames with different projections.  I tried, but I found that even when the map composer looked right, the export in all three export options changed the projection and center of each frame to match that of the last active frame.  So I ended up with a layout with three zoom levels centered on Brazil… interesting, but not what I had in mind.  So I exported an .svg file three times from the map composer – one for each map frame – and put them together in Inkscape.

Sneaky Cartography

One of the methods I often use in my maps is to create subtle blurred halos behind text or icons that might otherwise get lost on a busy background.  I don’t like when the viewer sees the halos (maybe it’s from teaching ArcMap far too many years at universities).  It’s not quite a pet peeve, but I think there are often better ways to handle busy backgrounds and readability.  My blog, my soapbox.  Can you spot them?  There are a couple in the map and in the final slide of the pitch video.  It doesn’t look like much, but I promise the text is easier to read.

The texture on the continents is the moon.  I clipped a photo of the moon using the continent outlines.  I liked the idea of trees on the moon.

Icons

The icons are special to me.  I’ve been really wanting to make a map using images from Phylopic and I thought this was the perfect opportunity… but… but… no one had uploaded outlines for any of the species I needed.  So I made them and uploaded them.  So if you want an .svg of these, help yourself.  If, however, you need dinosaurs, they’ve got you covered.

Watch it happen:

My pitch video captures the process from start to finish:

Want more open source cartography?

Come to FOSS4G North America and see my talk and several others focused on cartography.  I’ll cover the Inkscape methods and tools most commonly used for cartography.


Spatially Enabled Zotero Database

As a geographer, I’m a visual person.  I like to see distributions on a map and where things are matters to me.  A few years ago, while I was writing a paper I became overwhelmed with trying to remember the locations for the studies I had read (for coastal plants, latitude matters), so I started marking the locations of studies on a map and eventually turned it into a printed map.


But adding new studies and sharing the results is cumbersome, and the spatial data is largely separate from the citation information.  So I set out to find a way to store spatial information in my citation database and access the spatial information for mapping purposes.  The end result (which is still a work in progress at press time) is a web map of coastal vegetation literature that updates when new citations are added to my Zotero database online.


How I Did It:

Key ingredients: Zotero, QGIS, Spatialite, Zotero Online Account

I started working with the Zotero database I already have populated with literature relevant to my research on coastal vegetation.  I moved citations that I wanted to map into a separate folder just to make the API queries easier later.  I made a point in a shapefile for the location of each study using QGIS.  I gave the attribute table fields for the in-text citation and a text description of the location for human-readability, but the most important field is the ZoteroKey.  This is the item key that uniquely identifies each record in the Zotero database.  To find the key for each citation, in your local version of Zotero, right click on the record and pick “generate report”.  The text for the key is after the underscore in the URL for the report.  In the online version, click the citation in your list.  The key is at the end of the URL in the page that opens.


My map only has point geometries right now, but that will change in the coming weeks.

Next, I added the spatial information to the Zotero database using Spatialite (specific queries can be found on GitHub).  The Zotero schema is quite large but not impossible to navigate.  Currently, there is no option to add your own fields to Zotero (I tried… I failed… they tell me the option is coming soon) so I put my geometries into the “Extra” field.  Using Spatialite, I opened the Zotero database and imported my shapefile of citation locations (having new tables doesn’t break the database, thank goodness).  Then I removed any existing information in the “Extra” field and filled it in with geometry information in the style of GeoJSON.  The string looks like this:

{"type": "Point", "coordinates": [-123.069403678033, 38.3159528822055]}
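If you want to build strings like this outside of Spatialite, here is a minimal sketch in R.  This is not the query I ran (those are on GitHub); the function name make_point_geojson is made up for illustration, and it assumes you already have longitude and latitude as plain numbers.  Note that the order is longitude first, then latitude.

```r
# Hypothetical helper: format a longitude/latitude pair as a
# GeoJSON-style Point string like the one stored in the "Extra" field.
make_point_geojson <- function(lon, lat) {
  # %.15g keeps up to 15 significant digits and drops trailing zeros.
  sprintf('{"type": "Point", "coordinates": [%.15g, %.15g]}', lon, lat)
}

# Coordinates taken from the example string above (longitude, latitude).
point_string <- make_point_geojson(-123.069403678033, 38.3159528822055)
print(point_string)
```

Because sprintf() is vectorized, passing two columns of coordinates returns one string per study location.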

After updating the citation records to house the geometries, I synced the changes to my online Zotero repository from my desktop program.  Now it’s ready to go into a web map using the Zotero API.  My webmap code can be found in my GitHub Repository.

What’s Next?

I would like to develop a plug-in for QGIS that makes adding the geometries to the Zotero database easier because not everyone wants to run SQL queries on their active citation database that has been years in the making (I backed mine up first!).  The interface would show the citations you want to map, then users would pick a citation, then click the location on their QGIS project where the citations should be located.  The plug-in would insert the corresponding geometry for them.


Aerial Photography: Fly Along at Coal Oil Point

One day I was flipping through digital photo files that came from my hot air balloon photography rig when I realized that it almost looked like a video from the rig’s perspective.  That got me thinking about making a slideshow of pictures.  I found an open source tool on SourceForge called FotoFilmStrip and made a file with a set of photos with the same exposure (my dataset has 3 different exposures).  The program was fairly intuitive and easy to use, and like all open source projects, the price is right.  I was hoping that viewing the photos in succession like this would help me make some sort of new observation that wasn’t obvious before, but to be honest the only real benefit I can see is that it’s fun to watch.

Perhaps I’ll make slideshows for the other photo sets.


Aerial Photography: Dog Beach

Last summer I made a trip to San Diego for research and packed my kite aerial photography rig.  I ended up getting some shots over Dog Beach, which turned out to be fairly difficult since the wind was not cooperating.  To get the camera high enough to take usable photos, we ended up putting the kite in the air and then walking down the beach to add enough lift to raise the camera into the air.  The scene you see here captured a homeless man’s campsite situated in the dunes.  There were a few of them there, although no one was at them when the kite was taking photos.


Aerial Photography: Coal Oil Point Reserve

Recently presented with the opportunity to submit a photograph to an ecological photo contest on campus, I’ve been working on cleaning up some of my recent balloon and kite aerial photography flights.  This photo was taken with my hot air balloon rig, which is now retired due to safety concerns, a few years ago at Sands Beach at Coal Oil Point Reserve near Santa Barbara.  The site is one of the University of California’s Natural Reserves.  I’ve stitched a bunch of photos together using Hugin, then cropped the scene down to a rectangle so it would be more pleasing to the audience.  The jagged edges of the stitched scene would probably be confusing to many unfamiliar with the process.  The black squares in the image are targets that I made to help georeference the scene, something that has proven to be more difficult than most GIS analysts would suspect.


Robert Tobias Photography Logo

My dad has been a photographer pretty much all of his life and recently has decided to sell his work.  He’s very talented – you can see the attention to detail in his shots – so I really want to support and encourage him.  For Christmas, I surprised him with a T-shirt printed with a logo I designed.  The front of the shirt has the large design above and the back has the link to his website, below.  I went with a bit of an ’80s vibe since that’s the trend right now.  The camera image was one I modified from my own research logo, which I’ll share a little later, and was inspired by the simple iconography from the National Park Service.  Of course, I made the design in Inkscape, like any good open-source-supporting artist (plus you can’t beat the price!).