As we all learn more about how personal data lives online and affects us offline, people are understandably asking questions about how municipal governments are collecting or protecting sensitive data.
The City of Seattle in Washington, which has been recognized for excellence in open data, recently partnered with the Future of Privacy Forum (FPF) to assess how well the city is protecting residents’ privacy as it publishes open data. In January, FPF published a first-of-its-kind open data risk assessment and recommendations about how the city could improve privacy protections.
The report includes questions that every city open data program can learn from and act upon. To learn more about how Seattle is meeting its open government commitment while managing privacy risks, Sunlight Foundation’s Open Cities team interviewed David Doyle, the City of Seattle’s open data program manager. The following transcript has been lightly edited for clarity.
Katya Abazajian: Openness and transparency are one way to build trust with residents. Protecting privacy is another. How do those two forces balance in your work?
David Doyle: In Seattle, openness and privacy are pretty closely linked. The open data team and the privacy team are co-located within the IT department and we both report to the chief technology officer. Both programs have had a lot of support from city leadership and it was actually the privacy efforts that started first, before the city had a formal open data program.
KA: How do openness and privacy currently interact in the work of your team?
DD: Our operating philosophy is that no dataset goes on to the platform without a privacy review. My team, the open data team, reviews datasets for quality, but before we can publish them, our chief privacy officer and that team review the datasets for privacy concerns.
I came from an engineering background, so I think a lot about how to streamline processes. We want to be able to continue to have data flowing on the platform with as little friction as possible while also ensuring that we’re mitigating risks in privacy and quality. So, one of the first projects here was building some internal infrastructure to help manage the flow of open datasets internally. Whenever departments submit a dataset for review, they go to a tool and identify their dataset, as well as a number of required metadata fields. That triggers a notification to the privacy team and my team to pick up those datasets and review them. Once we have checked the box and made sure everything is good — or realized that we have to go back and have some conversations with that department — then the dataset automatically goes from private to public on the platform.
Another change is that we now look at the datasets in their totality. Previously, the team had been looking at snippets of the dataset. Now, we look at the whole thing. That change helped give people the sense of the scope and scale of the work. The Modern Open Data Benefit Risk Analysis tool gives us another set of criteria with which to review datasets, especially from the privacy perspective. We are looking at that and trying to figure out how to work that into our existing workflows.
Alex Dodds: Why did the city decide to do a privacy assessment?
DD: Our open data policy prescribes a risk assessment every year. In 2016, we decided that privacy would be the risk assessment we would do for that year. As I understand it, it was the first of its kind in the U.S. We’re really proud to make it available for everyone to pick up and read.
I think the mandate to do a risk assessment every year is really valuable because technology and data science are changing so fast. Having a yearly cadence where we step back and ask, ‘Okay, what kind of risk assessment should we do this year?’ is really helpful. And privacy is one type of risk, but it’s not the only one. There’s also the quality of our platform, the data we’re putting out, the experience residents are having with the portal or with the data itself. We need to think about all of those things.
What’s helpful about the FPF report is that it makes really clear the concepts of both risks and benefits. It’s not all just “risk risk risk.” There are really good reasons why we want to open data, and we need to be able to articulate those as well as the potential risks associated with the data.
Similarly, the implementation of the FPF report is going to be quite a lot of work over several years. It’s not a short-term thing. So right now, we’re looking at, ‘Well, which part of this can we implement this year?’
KA: One of the report’s recommendations was for community members to be more involved in the city’s privacy work. Are you planning to take up those recommendations?
DD: It’s something we’re looking into. One of the questions we’re asking is, would we need to ask residents about every single dataset? I mentioned earlier that we’re trying to reduce friction in the system and enable staff to publish data. We are committed to keeping data flowing on the platform. So we’re working to figure out how to do things like community engagement and collecting community input while still allowing city data to be available.
AD: Did FPF’s privacy assessment itself include community members?
DD: We have a Community Technology Advisory Board. They’re really the best vehicle for us to get community feedback. FPF joined their meetings early on to present their research methodology and discuss what they were planning to do, and I joined their meetings closer to the end of the project to give further updates. In August, we presented them with the draft report and took that opportunity to get some feedback and help shape the final product. We’ve tried to be really intentional about seeking that kind of feedback. We’re hoping the Community Technology Advisory Board will also be a way for us to start figuring out what datasets will be sensitive enough to require that public discussion, versus what would be okay for us to go ahead and publish without community involvement.
KA: What are some of the ways you’re currently making your work open to the public?
DD: We publish annual reports about our performance over the last year, as well as our plans for the year ahead. I speak at local colleges and universities, at the Community Technology Advisory Board, and at our local Code for America brigade, Open Seattle, every few months. I’m proactive about it and try to get out into the community regularly to share what we’re doing and also to get a pulse on the concerns people have.
If you’re only looking at public disclosure requests or Freedom of Information requests, that is, at people who are trying to extract information from the city through formal mechanisms, you get one sort of sense of what people are concerned about. My experience has been that when you actually go out and talk with a group of people, you get a better understanding of what they’re really thinking about. Those kinds of in-person conversations are very helpful because, if you don’t do that, people feel like you’re not being responsive, or that they have to go through formal mechanisms to get information. So I find that that’s pretty helpful and engenders a lot of trust.
I think you can then supplement that with the annual reports and your open data plan; it opens up avenues for people to have conversations with you.
I love going and speaking at the universities and colleges, in particular, because it gives me a sense of the next generation and where their minds are. They do think differently, that’s for sure. And we have to be prepared for what’s in the future, rather than reacting to what’s happening today. So it really helps shape our thinking about what we should proactively be releasing when it comes to data. Right now, we’re releasing tabular datasets, but we need to go much further than that.
KA: On the topic of community involvement, I saw that some of the recommendations in the report were specifically around equity and fairness in the city’s data. What are your thoughts about how your data privacy work could improve when it comes to equity?
DD: This was the area where we got the lowest rating, and that fact made us sit up, because we thought we had been doing a good job there. We talk about equity a lot, we think about it a lot, and it’s a huge priority for our mayor. Seeing that this was the lowest score made everyone realize we really need to understand this a little better.
In fall of 2016, we applied the Race and Social Justice Initiative toolkit to our open data program, and that was a great learning experience. As a result of that, we now always have at the back of our mind the question of whether the data we’re producing will have positive outcomes when it comes to equity. As we start to implement the FPF recommendations, we might go back and revisit that toolkit and ask, “Are there things there that we could apply here?” But, ultimately, we’re glad to hear this, because the whole point of doing an independent assessment is to get that external feedback.
KA: Who ultimately decides what data gets published?
DD: Right now, it’s basically up to each department which data they wish to publish. I don’t really have the ability to tell departments which data they should publish. I just help get the data on to the platform, and make sure we’re doing the right thing in getting it there. In terms of which datasets get published, and the reasons why we would publish one dataset over another — those decisions are siloed right now within each department. I think that’s the case with a lot of major cities. So, maybe the pressure will start growing on “How are you making those decisions? Why aren’t you just releasing more and more and more?”
We want to make it easier for departments to publish data at scale. And we have to be ready to do that. That’s why we’re focusing on operational processes and infrastructure. If we want to be able to start publishing real-time data from the “Internet of things” or devices — which operate at much larger scales than what we publish now — we have to make sure the right things have happened early on, so we can publish data appropriately on an ongoing basis. So these questions of who decides what goes out become more and more interesting.
AD: What advice would you have for other cities who might also be thinking about these types of questions?
DD: One of the interesting recommendations in the report is to keep the open data staff and privacy staff separate. We partner together a lot, and none of our positions have a funding mandate, so we had started thinking about whether the positions should be combined. The report points out, though, that it’s good to have more people thinking about these things, not fewer. Having separate staff looking at privacy versus open data is a good thing, because then you don’t have a single point of failure. You have several people who are able to have a conversation about a particular dataset or project, and that’s a really good thing.
AD: Do you have ideas or suggestions for cities that might not have the same resources as Seattle?
DD: We have a lot of smaller governments reach out to us already. Usually, what I say to them is that, if at all possible, it’s good to have a separation between privacy staff and open data staff. I think having someone within your government who is focused primarily on ideas of privacy and risk separate from the people who publish open data is something you should look to have, no matter how large your organization is.
The other thing is that we are going to strive to make Seattle’s implementation process as public as possible. We’ve already published the findings of the report. I’m hoping in a few months, we’ll also be able to say, ‘here’s what we did’ or ‘here’s what we plan to do’, so that other cities can see how we took a lot of big recommendations and pared them down to a list of actionable steps. Hopefully, us being transparent and sharing as much as we can will help other cities as well.