OpenData has been a hot topic for the past few years, and it means many different things to different people. The commercial sector sees (or should see) the potential to leverage OpenData to create new products and services it can sell, e.g. Airbnb and Uber both rely on access to other companies' spatial data (maps). The public service sector should be able to use the data to create better public services, and government decision makers should be able to use it to make informed decisions.
Note the use of words like "use" and "leverage"; both imply the ability to access the data, and more specifically to access it in a Machine Readable format. As you can imagine, the data we are talking about comes from many different sources and actually lives in repositories and databases all over the internet, maintained and contributed to by many different actors. The different user groups who need the data will most likely work with their own specialized tools, be it ArcGIS, QGIS, STATA, R etc., so it is critical that those tools can talk directly to the data they need; hence the need for the data to be in a Machine Readable format.
One of the issues, especially in Tanzania (and probably for many other regions of the world), is who or what is going to seed that initial collection of data, AND who or what is going to make sure that the people who could potentially use the data are aware of it and can actually access and use it. Increasingly, donor-funded projects are facilitating the collection of huge datasets. The Zanzibar Mapping Initiative is an excellent example: the whole island was mapped using small-scale drones, for the most part by Zanzibaris for Zanzibar, and the data has been made available as OpenData via the ZMI GeoNode and OpenAerialMap. This in itself is a great win, especially as the approach taken was to train Zanzibari students and surveyors, so if nothing else it has increased capacity. We here at Uhurulabs rely on our commercial contracts, as they allow us to subsidize our public/innovation sector work, and for our commercial work we rely on professional human resources; see our pilots, two of whom are from Zanzibar and products of the ZMI project.
So now that the data collection has been seeded and done, and the data is available as OpenData, how has it been used so far? The data has been used for things like building footprint digitization, and we hope to provide a link to that soon. It has also been used by the Zanzibar Commission of Lands for land use and city planning; we also hope to provide links to information about that. Another very exciting use case has been the Open AI Tanzania Challenge, which invited data scientists to develop feature detection algorithms that can automatically identify buildings and building types using high-resolution aerial imagery. It is our hope and expectation that the classifiers that were developed will be released as OpenSource, and as such we have asked about this in the Challenge Forum.
Another very exciting initiative is Ramani Huria…
"Ramani Huria is a community-based mapping project that began in Dar es Salaam, Tanzania, training university students and local community members to create highly accurate maps of the most flood-prone areas of the city. As the maps have taken shape – their benefits have multiplied and their potential magnified, now serving as foundational tools for development within all socio-economic spheres beyond flood resilience. The project is supported by funding from the UK Department for International Development through the Tanzania Urban Resilience Programme."
We were lucky enough to be involved in this project; most of our work involved collecting data using drones. It always seemed that, to some degree, the drones stole away some of the attention from the _real_ work that was going on, on the ground. Everyone was very impressed with the quality, speed and accuracy with which drones were able to capture high-quality aerial images, but the real rich, actionable data came from the boots on the ground. That data includes things such as whether a particular house has been flooded and if so to what height, whether a structure is public or private, etc. What is really special about this data is that it is OpenData: its collection was funded by public money and done for the most part by specially trained university students, and it is now available for free for public servants to make informed decisions and for businesses to create new services from.
In the maps below you will see the standard OpenStreetMap basemap; all the data from Ramani Huria was contributed to OpenStreetMap. However, you will notice more detailed layers on top that give you more information and allow a deeper zoom level. You can then use the feature info tool (i) to get detailed information on the various assets, e.g. clicking on a building will give you information such as whether it is residential or not.
It is not, however, just donor-led projects that are releasing OpenData. Tanzania joined the Open Government Partnership (OGP) in 2011.
Tanzania declared its intention to join the OGP during the launch meeting in September 2011, one of six countries in Africa that qualified to be involved in the OGP.
Make the data available – in bulk and in a useful format. You may also wish to consider alternative ways of making it available such as via an API.
Make it discoverable – post on the web and perhaps organize a central catalog to list your open datasets.
Note two words in particular, "API" and "discoverable"; these words get thrown around quite a lot, but what do they mean in this context? 99% of the people who use an API will never know they are using it; it simply means that the data is stored in such a way that the software YOU use to get your job done can access it, without you needing to know the technical details of HOW it is being accessed. An example of this is when you go to Google Maps, search for a location and then tell the app to give you directions there: you don't need to know the technical details of HOW that is happening, or how it knows that there is heavy traffic on a certain route. You simply want to enter a location and click a button, and it is the fact that many systems make their data available via an API that makes this possible.
What does discoverable mean in this context? Simply that a user or system is able to find the data when it is needed. This can be as simple as a search box on a website, or more complex, e.g. a system that pushes information to you based on your location and previously identified preferences.
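To make that concrete, here is a minimal sketch of what "available via an API" looks like in practice, assuming a dataset published on a GeoNode instance like the one we describe later in this post; the layer name geonode:buildings is a placeholder for illustration only. Any GIS tool, script or website can pull the data directly over HTTP in exactly this way:

# Fetch a published layer as GeoJSON from the WFS API that GeoNode exposes via GeoServer.
# The layer name below is hypothetical.
curl "https://geonode.uhurulabs.org/geoserver/ows?service=WFS&version=2.0.0&request=GetFeature&typeName=geonode:buildings&outputFormat=application/json" -o buildings.geojson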
Which now brings us to the main point behind this post. Yes, it is true that Tanzania has made great strides in its ability to collect valuable geospatial and other data. It has also made good progress in its ability to make that collected data Open. Where I think it is now struggling is in the ability to store, manage and reliably make that data available through easy, useful, discoverable systems, using things such as APIs and well-crafted interfaces that can be accessed by all stakeholders. This, however, is not Tanzania's struggle alone; the whole world is seeing an explosion in the availability of data, coming from all kinds of sources: drones, new low-orbiting satellites, weather stations, government, the private sector, individual citizens etc. At the same time there is an increasing awareness that good decision making comes from the utilization of good data. Last month we were invited to speak at the Smart Land Administration forum in Finland, where the topic of the day was Spatial Data Infrastructures (SDI).
A spatial data infrastructure (SDI) is a data infrastructure implementing a framework of geographic data, metadata, users and tools that are interactively connected in order to use spatial data in an efficient and flexible way. Another definition is “the technology, policies, standards, human resources, and related activities necessary to acquire, process, distribute, use, maintain, and preserve spatial data”.[1]
A further definition is given in Kuhn (2005):[2] “An SDI is a coordinated series of agreements on technology standards, institutional arrangements, and policies that enable the discovery and use of geospatial information by users and for purposes other than those it was created for.”
Note that from the above definitions it is clear that the weight is on the institutional policies, standards and agreements which need to exist. Without good, strong institutional policies, standards and agreements, an SDI, or anything like an SDI, cannot exist, and as such the public will NEVER realize the true potential of the data revolution. Data will continue to be collected, as the market understands it has value, but rather than being a tool to liberate the general citizen it will be a tool to control and oppress.
Keep going to the government websites I mentioned above, use the data, and when you have questions about it ask the relevant body, which is very often NBS; the reality is that for the most part people in Government are there and willing to help. It is only through the use of such data that people start to understand its value and become motivated to make sure it is kept up to date and available. If you find a website or service to be down, then inform the owner.
At Uhurulabs we are committed to serving and helping anyone who wants to use data and technology for the betterment of Tanzania and its people. As such we are committed to maintaining a number of services, the first of which is now live: a GeoNode where anyone can host geospatial data, and which anyone who wants to access the data can use.
https://geonode.uhurulabs.org
Our resources are limited, and as such we will try to scale the service according to demand; if you experience any problems please let us know by emailing geonode@uhurulabs.org. For the most part we will be trying to aggregate datasets that we find in other places, especially:
data that is not available via an API
data that we believe is at risk of being lost
data we have collected ourselves.
The second is one which came to mind while writing this article: a page that monitors the various websites and services that provide access to data on Tanzania.
I am probably going to get a lot of flak for this post, but I have been thinking this since I was about 30 minutes into the movie.
First of all, I almost never go to the movie theater; prior to going to see Black Panther I had not been to the movies in over three years.
However both my wife and I thought that this could be a really good movie that we could both enjoy.
On to the point, however: why I think the movie was a huge disappointment and such a missed opportunity. I will keep it brief and to the point …
The premise is that a nation of Africans would exist in Africa and choose not to get involved or intervene in any way as they see their fellow African brothers and sisters being used and abused by much of the rest of the world. Fine, this is not something that is highlighted in the movie, i.e. we do not see Wakanda during the period of colonization and slavery, but it is something that certainly came to my mind when thinking about the backstory. I must also say that this is something I could have looked past and "ignored", creative license and all that, if it were not for the rest of the issues with the movie.
The story the movie tries to sell is that the idea of "helping the rest of the world" came from a Wakandan who had been living in the States, who was then killed by his own brother, the king (I hope I am getting that right). Then, years later, it is the son of the slain brother who returns to Wakanda (as the villain) and single-handedly manages to overthrow the current king, intent on using Wakandan technology to make Wakanda an active, dominant world power. Of course the villain never wins and the rightful king regains power. Such a missed opportunity: why not have the cousins (i.e. the rightful king and the "villain") end up uniting, and together have Wakanda take its rightful place in the world and leverage its technology in a positive way to help the world?
So we have morally questionable Wakandans who have allowed their brothers and sisters in neighbouring countries to suffer without intervening; I say "morally questionable" because in my opinion standing by and allowing your fellow humans to be abused and suffer when you have the means to help is "morally questionable". And the villain who manages to take control, but then, as all good stories go, loses to the rightful king in the end. These events, however, were seemingly enough to wake up the Wakandan people to the need to help their brothers and sisters in the world who are suffering. So where do they go to help? … inner-city USA!
UAV data has been collected over a large area of more than 1,500 sq km. The UAVs used were small, and as such the area was divided into 200+ zones, each of which was then processed into an individual GeoTIFF. The data was collected without absolute accuracy, and so although the data within a given zone is relatively accurate, there are varying degrees of edge-matching issues when attempting to put all 200+ zones together.
Attempted solution…
A post-processing step that takes the individual zones, automatically applies adaptive filtering to each, and then attempts to fit them together using edge matching.
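The adaptive filtering itself is specific to our pipeline, but as a rough sketch of the mosaicking side, the individual zone GeoTIFFs can be stitched into a single virtual raster with GDAL so the zone edges can be inspected before and after processing. This assumes the unprocessed and processed zones sit in pre/ and post/ directories:

# Build lightweight virtual mosaics of all zones; a VRT references the source files
# rather than copying them, so the edges can be inspected in QGIS without rewriting
# hundreds of gigabytes of imagery.
gdalbuildvrt pre_mosaic.vrt pre/*.tif
gdalbuildvrt post_mosaic.vrt post/*.tif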
Test Data…
A 27-zone section was selected as the test dataset.
Preliminary Result…
3 km x 3 km zone: "pre" and "post" processing images.
At a glance it looks good, so let's look a little closer…
Sustainia just launched their 4th Global Opportunity Report together with DNV GL and the United Nations Global Compact. Over the years, the Global Opportunity Reports have proved that no challenge or risk is too big for business to tackle, and that there is market potential in every one of the 17 Sustainable Development Goals, from smart farming eliminating hunger to solar micro-grids providing clean energy.
All market opportunities accumulated in the Global Opportunity Reports are now available on our innovation hub Global Opportunity Explorer for everyone to explore. But which market has the biggest positive impact on people, planet and profit?
So for a couple of days last week I went back to school. A few of us from africanDRONE, namely Unequal Scenes, Microdrone, African Defence Review and of course Uhurulabs, were sponsored by ICFJ to attend a three-day intensive Drone Journalism School at the University of Oregon, in Portland.
The conference was organized by Google News Labs, the University of Nebraska School of Journalism and Poynter. It was a very interesting experience as for the most part my work with drones has been for data collection, survey etc, while most in the room were focusing on using drones for video and photography. One thing we all did have in common however was an overwhelming understanding that the rapid commercialization of consumer drones has changed the realm of what’s possible and by whom in ways we could not have thought possible five years ago.
A main focus of the conference was the training for the American FAA Regulations Part 107, which are the new rules for non-hobbyist small unmanned aircraft (UAS) operations in the USA. The conference had several great tutors who took everyone through all the aspects of the regulations in preparation for the attendees to take the exam.
Regulations are the topic of the day, and of great concern to anyone who wishes to operate UAS commercially. All over the world the various regulatory bodies are struggling to create rules and an environment that allow for the safe and fair usage of our skies. A colleague from africanDRONE has said that he thinks the American approach is "pragmatic and economically advantageous"; this, of course, is relative to the South African regulations, where it costs up to $10,000 to be fully registered and certified to operate UAS commercially. Contrast that with Tanzania, where currently there is no payment required for the registration of UAS; there are only procedures you are required to follow with the Tanzanian Civil Aviation Authority (TCAA) and the Ministry of Defence (MoD). I personally have done this a number of times and have found the process straightforward. The one thing I will mention is that until now you have been required to do this for every flight, which can become a burden. I am currently in the process of getting a more permissive permit, which will perhaps only require me to log a flight plan whenever I want to fly; fingers crossed on that!
Lots of people know me as the drone guy; what many don't know is that I don't think I am actually a very good drone pilot. I almost never fly for photography or video, and much of my work is automated, with all flights planned and programmed in the office. I rarely get my hands on the latest consumer drones, and even when my friends offer to let me try theirs I say no, as I don't want to crash a $6,000 Inspire! Luckily for me one of the conference sponsors, DJI, had brought a number of their latest drones and took us all out for some test flights. The Inspire 2 is awesome, but would be a waste in my hands; happy though that I finally had a chance to fly it!
Although I have been using KVM (the Linux Kernel-based Virtual Machine) on my own infrastructure for some time, it is only recently (the past 4 years) that I have been comfortable enough with it to use it in production for my clients, especially when other admins would be involved in the administration of the systems. A choice I often made was to use Citrix XEN, as it…
Has a commercially supported version should my client decide to stop using me.
Has a fully free and open version which for the most part is not crippled; at least it has all the core functionality that my clients need.
One thing that always frustrated me about it was the fact that I needed to use a Windows machine to access its management interface; yes, there is a community-driven Linux version, but it never worked quite well.
Anyway, one of the clients where I had used Citrix XEN is now ready for an upgrade, and I found myself in a situation where I needed to move virtual machines from XEN to KVM. It took quite a bit of digging and experimenting, but this is what I came up with.
First of all you need to gather three pieces of info:
NETWORK-UUID is eb57b4d8-7656-c607-b1cc-93cfd8766afe
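Identifiers like this can be pulled straight off the XEN host with the xe CLI; a minimal sketch (the VM name below is a placeholder):

# UUID of the network the transfer will be exposed on.
xe network-list
# UUID of the VM and of its virtual disk(s) (VDIs).
xe vm-list name-label="my-vm" params=uuid,name-label
xe vm-disk-list vm="my-vm"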
We now need to tell the XEN server to put the machine in "transfer" mode. NOTE that this will make the machine unavailable for the duration of the export process.
Now that we have the URL, http://47b5ebf70160c752:f2b25a5854c12b76@192.168.13.88:80/vdi_uuid_2713e717-8847-4d1c-abb8-4725a0ce1d88, we can use curl to get the image.
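From here it is just a download and a format conversion; a sketch, assuming the exposed image is a raw disk:

# Pull the exposed disk image down to the KVM host.
curl -o disk.raw "http://47b5ebf70160c752:f2b25a5854c12b76@192.168.13.88:80/vdi_uuid_2713e717-8847-4d1c-abb8-4725a0ce1d88"
# Convert the raw image to qcow2 so it can be attached to a KVM/libvirt guest.
qemu-img convert -f raw -O qcow2 disk.raw disk.qcow2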
I have been noticing with more and more concern the increasing number of what look to me like VERY young police officers on the streets carrying what look to me like AK-47s (I don't know much about guns!!!). When I have seen what feel to me like "gangs" of them on street corners I have honestly asked myself whether I feel more or less comfortable. The fact is they make me nervous as hell, and part of me has been comforted by the assumption that probably most of the guns are non-functional or not loaded. I came across a video yesterday that confirmed that I should be very afraid, AND my hope that the guns are not loaded or not working was crushed: indeed they are loaded, they do work, and unfortunately it looks like the police are somewhat lacking in their training.
The idea was to merge 28 irregularly shaped GeoTIFFs, ranging in size from 1 GB up to 13 GB and totalling 113 GB, into one GeoTIFF. Why? Because I then needed to cut the resulting GeoTIFF into multiple irregularly shaped individual GeoTIFFs. This took a couple of days, and I was quite keen to come home today as this morning it was at 80%; you can only imagine my disappointment when I checked and found…
0...10...20...30...40...50...60...70...80...90..Traceback (most recent call last):
  File "/usr/bin/gdal_merge.py", line 540, in <module>
    sys.exit(main())
  File "/usr/bin/gdal_merge.py", line 526, in main
    fi.copy_into( t_fh, band, band, nodata )
  File "/usr/bin/gdal_merge.py", line 270, in copy_into
    nodata_arg )
  File "/usr/bin/gdal_merge.py", line 63, in raster_copy
    nodata )
  File "/usr/bin/gdal_merge.py", line 105, in raster_copy_with_nodata
    nodata_test = Numeric.equal(data_src,nodata)
MemoryError
This, as you can imagine, was very frustrating, and I just assumed that the output was junk, but figured what the hell, it has created a 301 GB file, let's see what it is. I started the process of loading it into QGIS, and since I figured it would take a while, started this blog post. It seems though that the output might be useful, as QGIS eventually loaded the file, and it actually looks like what I expected. So first things first, I have set QGIS to save the file under a new name… It is going to take a while; it is currently at 12.2 GB and I actually expect it to end up bigger than the initial 301 GB.
… a day or so later …
So it turns out the merge worked; I will revisit the error later, but the file saved by QGIS had exactly the same size and gives the same result with gdalinfo. AND I have just done my first clip, BUT it seems I made a mistake; the command I used was
We have been doing a lot of work with drones over the past couple of years. Much of it has been proof of concept for the use of small, under-1 kg drones to capture aerial imagery as an alternative to both manned aircraft and satellite, the premise being that deploying small drones is much easier and cheaper than the alternatives and that the acquired data would be of equal if not superior quality. For the most part we have used senseFly eBee drones and done our post-processing work in Pix4D Mapper, both being professional-grade survey hardware and software. There are many possible outputs, but one of the main ones is an orthophoto; according to Wikipedia:
An orthophoto, orthophotograph or orthoimage is an aerial photograph or image geometrically corrected ("orthorectified") such that the scale is uniform: the photo has the same lack of distortion as a map. Unlike an uncorrected aerial photograph, an orthophotograph can be used to measure true distances, because it is an accurate representation of the Earth's surface, having been adjusted for topographic relief,[1] lens distortion, and camera tilt.
The format of the orthophoto output from Pix4D is GeoTIFF, and using gdalinfo we can extract the following info:
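The inspection itself is a one-liner; a sketch, assuming the orthophoto is the oysterbay.tif file from the listing further down:

# Print the driver, size, bands, projection and georeferencing of the orthophoto.
gdalinfo oysterbay.tif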
This is for a sample 2 GB GeoTIFF. Now this might not sound too big, but it is for an area of about 1.5 sq km. As you can imagine, when you're doing projects that cover hundreds if not thousands of sq km you will quickly need huge amounts of storage space. And that is not the only problem: as far as I know, the most popular open-source platform for hosting such data is currently a combination of GeoNode and GeoServer, however after some experimentation I found that serving these images as-is was simply not an option; it was taking GeoNode too long to render them, resulting in an awful user experience.
I looked into many options, and one of the best seemed to be MBTiles; however, in the end it was not suitable… a blog for another day…
After much research online it seemed like the best option was to first compress the GeoTIFF with gdal using gdal_translate. So far the best options I have been able to find are:
"-b 1 -b 2 -b 3" specifies that we only want the first three bands; if you refer to the gdalinfo output above you will see that for some reason Pix4D includes an alpha band. We don't need it, I think! And it's critical to have only the three bands or the PHOTOMETRIC flag won't work. The next option is
"-a_nodata", which sets the nodata (transparency) value to 0, allowing the generated file to have transparency.
COMPRESS=JPEG is quite self-explanatory: this is the instruction to GDAL to use JPEG compression, which for imagery like this is far more effective than lossless compression such as LZW or DEFLATE.
PHOTOMETRIC=YCBCR changes the color space used from RGB to YCbCr. From what I have been able to find out (thanks Google), the eye is more sensitive to changes in luminance (Y, brightness) than to changes in chroma (Cb, Cr, color), so it is possible to discard some chroma information while retaining image quality, allowing for better compression without any visible loss.
TILED=YES again is quite self-explanatory: it stores the image data in a tiled format, which makes for a much better user experience when viewing the data in something like GeoNode or QGIS.
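Putting all of the above together, the command looks roughly like this (a sketch; the filenames match the listing below):

# Strip the alpha band, mark 0 as nodata, and write a tiled GeoTIFF using YCbCr JPEG compression.
gdal_translate -b 1 -b 2 -b 3 -a_nodata 0 \
  -co COMPRESS=JPEG -co PHOTOMETRIC=YCBCR -co TILED=YES \
  oysterbay.tif oysterbay_JPEG_YCBCR.tif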
151M Mar 28 17:54 oysterbay_JPEG_YCBCR.tif
2.0G Mar 2 07:49 oysterbay.tif
As you can see, this has reduced our file from 2 GB to 151 MB, a 13-fold reduction in size; it took my workstation 39 seconds to do the compression (you can see details of my workstation here).
The next step is to add some zoom scales (overviews) to the image; this will result in a slight increase in file size but will give a much better user experience when zooming into and out of the image.
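One way to do this with GDAL is gdaladdo, which builds internal overviews; a sketch (the overview levels are an assumption):

# Build averaged overviews and keep them JPEG/YCbCr compressed as well.
gdaladdo -r average \
  --config COMPRESS_OVERVIEW JPEG --config PHOTOMETRIC_OVERVIEW YCBCR \
  oysterbay_JPEG_YCBCR.tif 2 4 8 16 32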
Note that the GeoTIFF is now compressed using JPEG and uses the YCbCr color space.
Below, on the left is a section of the original orthophoto and on the right the compressed version; can you see a difference?
Here is a section zoomed in even closer; 5 points to anyone who knows what you're looking at!