Open city data in Philadelphia: the obstacles and triumphs of the L&I example

A screenshot of a draft of the License to Inspect tool, built by Azavea for PlanPhilly using the new L&I app.

A feature story on ‘License to Inspect,’ the as-yet-unreleased, API-based online tool from the Philadelphia Department of Licenses and Inspections, was published on Technically Philly on Monday. I reported and wrote the story, which covers the tool’s inspiration and the hopes for it, over the last couple of months.

It is the last major feature of the Transparencity grant project I’ve been leading, and one of the more detailed investigative reports I’ve done in my journalism career. The feature, which details the nearly two-year struggle to take public a project that had internal support, is meant to show the lessons learned and obstacles faced, in the hope that other city agencies can release their data more efficiently for development and citizen use.

Give it a read for lessons that apply to any local government, and then find some of what didn’t make it into the piece below.

Below, portions of my reporting and writing that didn’t make it into the final feature:

  • Focusing the effort on getting data out the right way was important for a few reasons, Burns, the L&I commissioner, said: (1) transparency; (2) reducing staff workload on freedom of information requests; (3) “We want to show that the Licenses and Inspections Department is an integral part of the community, by being able to show just how much we do with a small staff”; and (4) showing how much this department has improved in recent years, battling its reputation.
  • Lessons learned: have high-level requirements and spend time on meeting them, but set deadlines and stick to them. Develop the sense of urgency that exists in every private workplace.
  • “What Nutter needs to say when people ask about, say, the property tax delinquency problem is, ‘We’re going to fix that by getting all of our records straight and public and transparent,’” Cheetham said. “When we do that, we’re going to realize problems we didn’t know we had. No city agencies are cross-referenced. BRT data was the best the city had but it can be flawed. The way to make this real is to get this sense that the only way to truly fix problems, like property taxes, is to get all our records publicly shared in a format that can be used, like we’re finally seeing done with L&I.”
  • “If this is viewed as successful, this will shine a lot of good light on taking more time to build a services API. In city government, we immediately default to do a data dump, which means we need to find the fields and each project we’ll do a new one. With five new projects, that’s five new data dumps we have to manage. If we can scale off an API, we only have to regularly maintain that one service.”
The story did well in traffic for a boring data story, drawing a few hundred hits in the first few hours and getting shared and discussed on social media. It even received a Facebook Like from, yes, the Mayor, despite the story's tough talk.

 

  • “Sometimes it might seem like an API is more of a challenge, but even data sets can present problems with maintenance. This was a chance to connect with Open 311 and make things easier for the future,” said Clinton Johnson.
  • “But the city shuts down over the Christmas holiday. No changes to the city’s servers or services during any extended break when a lot of core staff aren’t around, to avoid any big problem. So for a good chunk of December, nothing can happen, like launching and running an API,” said Cheetham.
  • By July and August 2010, Cheetham says, “everyone was asking the right questions still, from database specifications to field details to data accuracy assessments.”
  • Burns and Gupta made clear immediately that they didn’t have the capacity on their own to provide a live data feed, “so they were going to release data as a daily text file export that would get emailed, received and processed to be loaded for release to the public by geo-tagging each record and standardizing them all,” he said. “Everyone involved was moving forward with great enthusiasm and support, without any adversarial in-fighting at all.”
  • “By September 2010, everything was green lighted. Azavea is building the application, and we’re already planning to roll it out in the next few weeks,” Golas said. “But then we just suddenly felt the sea change at the city level.”
  • “Allan [Frank] saw DOT as the natural gatekeeper for the [L&I] project, and it takes time to take something like that over,” says another developer close to a stakeholder.
  • “Data has to be accurate because people will be drawing conclusions from it,” Burns said.
  • “DOT said they’re into this, we could have something done by end of September. We get a specification draft in October, a Word document of ‘this is what it’s going to look like,’” said Cheetham. “They sent a sample of the API, a single data field, and we just had to ask for more. It was just this incremental change. And then, well, in December, the city locks down changes to its systems right before elections and during major vacation time, particularly for effects to the mainframe. It’s about keeping everything secure and stable when people are away. So we get into December when they finish the next draft, but we have to wait until after the holiday lock down. We talked about January 7 as the date, and we got a data stream as planned around then. They asked for feedback, we gave it, we started using it and building the application around it. We found issues in the data and limitations in fields, including that a number of the fields agreed upon by PlanPhilly and L&I were not in there. They said ‘they weren’t in the spec,’ but they were in the spec, so DOT went back to add to that. So then a new development process starts. They’re going to redevelop this with expanded fields. L&I would check in once a week for the status, and we’d get news that the specification is 50 percent revised, now 75 revised, the application is 10 percent done, now 20 percent done. We got another version of the API in July. We turned around a response in 24 hours saying here are the eight things that are still buggy. They’ve acknowledged those problems and that’s where we are.”
  • The data appears to be rather accurate, which is always a concern with city records: “we aren’t ground-truthing it, but it makes sense and we have faith in it,” says Cheetham, no stranger to data sources. Some batch uploads and a handful of minor inconsistencies exist but “nothing to call this project into question,” he adds.
  • “We had more than a few meetings with L&I early on. Those centered on what information do we want in this application and Azavea saying this is how it would work with these data sets, who would be responsible for any liability, what could be done with the data once it was released, all those specifics,” Golas said. “The technology solution was simple early on, getting a daily data dump of all the fields we requested that were entered into the L&I database system, called Hansen.”