FINRA’s R&D Program: Exploring the Future of Advanced Analytics
FINRA's Research and Development Program is using advanced analytics to change the way FINRA performs its essential regulatory functions. Feeding off FINRA’s culture of innovation, the R&D program is designed to be quick and agile, hoping to find transformative new technologies, but willing to fail and fail fast when an idea doesn't work out.
On this episode, we hear from FINRA Technology’s Ivy Ho and Greg Wolff about how the R&D program is pioneering the future of how FINRA fulfills its mission.
Listen and subscribe to our podcast on Apple Podcasts, Google Play, Spotify or wherever you listen to your podcasts. Below is a transcript of the episode. Transcripts are generated using a combination of speech recognition software and human editors and may contain errors. Please check the corresponding audio before quoting in print.
00:02 - 00:28
Kaitlyn Kiernan: FINRA's Research and Development Program is using advanced analytics to change the way FINRA performs its essential regulatory functions. Feeding off the culture of innovation, it is designed to be quick and agile, hoping to find transformative new technologies, but willing to fail and fail fast when an idea doesn't work out. On this episode, we hear how the R&D program is pioneering the future of how FINRA fulfills its mission.
00:37 - 01:01
Kaitlyn Kiernan: Welcome to FINRA Unscripted. From Hoboken, New Jersey, I'm your host Kaitlyn Kiernan. Today, we've got one new guest and one returning guest with us. We've got Ivy Ho, Vice President with FINRA Technology, who last joined us on episode 36 about the Digital Experience Transformation. And then our new guest is Greg Wolff, FINRA Technology's Enterprise Software Architect. Greg and Ivy, thanks for joining us!
01:02 - 01:03
Greg Wolff: Thank you for having us.
01:03 - 01:05
Ivy Ho: Thank you for having me.
01:05 - 01:18
Kaitlyn Kiernan: So together Greg and Ivy spearhead FINRA's Research and Development program, which is what they're here to talk with us about. So, Ivy, just to get started at the highest level, what is FINRA's R&D program and when did it start?
01:19 - 02:26
Ivy Ho: Okay, so it is an R&D program, a research and development program for advanced analytics. And let me first define what we meant by advanced analytics. It is a term that we use internally for disruptive technologies like big data, artificial intelligence, machine learning and natural language processing. So, we recognize that advanced analytics is changing the way we're going to perform our regulatory functions. So, we envision a future where advanced analytics can automate many of the tedious and repetitive tasks so that our staff can work on deeper issues.
So, recognizing the disruptive and transformative potentials of this type of technology, FINRA launched the analytics R&D program at the beginning of 2019 and our goal is to explore the use of innovative advanced analytics techniques to solve problems and address opportunities. And the goal of the program is also to build a culture of innovation. We want to explore, research and prototype many of the innovative ideas using leading edge technologies to empower our staff and to maintain our position as a leading self-regulatory organization.
02:26 - 02:33
Kaitlyn Kiernan: And of course, innovation is one of FINRA's core values which we've also talked about on a previous podcast.
02:33 - 02:35
Ivy Ho: Absolutely.
02:35 - 02:40
Kaitlyn Kiernan: And so how is the R&D program structured? Who's included in this program?
02:40 - 03:54
Ivy Ho: So, it is a mechanism for us to conduct lots of fast and cheap experiments outside of conventional projects. So, the governance for the R&D program is very lightweight, is very agile, and it is for us to explore opportunities with our business partners in small teams. So, it's got a multi-tier governance structure. The executive leadership team which consists of members of FINRA's Management Committee sets the strategic direction for the program. Then there's a 16-member leadership team consisting of leaders from all business units, including technology, and this team champions the various analytic R&D projects. This team reviews and selects and sets priorities for the projects.
It also helps us to identify the business sponsors, articulate the business value and then also help us recruit business subject matter expertise to participate in the programs. Greg and I lead the program and we're supported by a team of coaches. The coaches are senior technologists and data scientists who have a lot of advanced analytics expertise. And together we provide technical guidance to the projects. Coaches typically also serve as project leads for the projects.
03:54 - 04:00
Kaitlyn Kiernan: Okay, and how often do you and Greg and the senior leadership management committee meet for this?
04:01 - 04:46
Ivy Ho: So, we meet with the ELT, the executive leadership team, at least on a quarterly basis. And then the leadership team meets as often as necessary. Typically, when a project is written up, we present it to the leadership team for their review, feedback and also the ratings. And at the end of each project we also have what we call findings presentations; it's the discovery that we did. So, we present the recommendations of what we learned, a prototype that we built to the leadership team. So, we meet at least, I would say, every other month and as often as needed. And then the coaches and Greg and I meet on a weekly basis to set the priority, review the submitted ideas. We'll talk about the process a little bit later, but we also fine tune the project to get it ready.
04:47 - 04:54
Kaitlyn Kiernan: So where do the ideas for these projects actually come from? How do they become part of the R&D program?
04:54 - 06:52
Greg Wolff: Ideally the best ideas come from the business staff who say, "Look, I've got this problem. Is this possible? Is this even possible?" The more radical and transformative and innovative it is, the better, really, because as Ivy pointed out, this is about research, about pushing the envelope for us, about making our capability better, which really is learning. And it's not the people in IT; they're really big in data science, they know how to implement. It's the people that are doing the work that have the ideas around, "Wow, if we could just do this thing." Then they come up, they write it down, and then we have one or more of the coaches work with them to flesh it out, to build it up, to actually write it down.
We all go through it to try and enhance it. And then when we've got an actual write up that embodies that idea and a businessperson who's going to sponsor it, then we will present it, as Ivy said, to our leadership team and it gets rated. One of the primary rating criteria is how transformative is it. Is it really pushing the envelope for us? If it's just one more thing we could have done in a project, I mean, that's not research. If, on the other hand, it's something like how might we actually use one of the new language models to look at the quality of Market Reg data as it's submitted. It turns out that idea has turned into a really powerful set of concepts and tools in our data receipt pipeline, so that we can know before we have a problem in processing that something's wrong with the data. And we get so much market data that ideas like that are impactful. So, ideas come from everyone hopefully.
06:53 - 06:59
Kaitlyn Kiernan: With billions of market events a day, it makes sense that if you can catch the issues early, that makes a huge difference.
06:59 - 07:00
Greg Wolff: Right.
07:00 - 07:35
Ivy Ho: There's also a great source of ideas from our internal annual hackathon. So FINRA has an internal two-and-a-half-day hackathon that we call the Createathon. The Createathon is a great source of ideas also that we use to seed the R&D pipeline, because during the Createathon over 50 teams are formed and more than 50 ideas come out of the two-and-a-half-day Createathon and then we use it as seed for our pipeline. So, we want to take ideas from the two and a half days of work and evolve it into something that is more substantial. So it is a great source of ideas for our pipeline.
07:35 - 07:51
Greg Wolff: The best ideas have come from groups of just the end user business staff who said "it'd be really cool if we could do this," and then they form a team, the Createathon happens and then we've gotten the best ideas from there.
07:51 - 08:00
Kaitlyn Kiernan: So, you need people from the businesses to tell you where the pain points are, where the struggle is, and then you need folks in Technology to help them figure out whether it's possible.
08:00 - 08:33
Greg Wolff: Right. Now we've gotten some very good ideas from Tech folks as well. It's not limited there, but of course the Tech folks are thinking about this technology and can we research that and prove that it's actually viable for us to use. Whereas the really impactful ideas are this is something that would make all our jobs better on the regulatory side. That has the potential to actually change how we work. And changing how we work and getting better, that's what makes us a better regulator.
08:34 - 08:44
Kaitlyn Kiernan: And we actually had a podcast last September, episode 43 about the Createathon and we talked to the winning team from last year's Createathon. Is their project on the pipeline?
08:45 - 09:15
Greg Wolff: Ideas from that project have ended up in several of our pipeline projects. The exact idea in the Createathon doesn't typically end up being exactly what this R&D project is. Generally, what happens is, what is the most difficult hardest problem part of that idea that we really need focused attention on in order to enable actually doing the idea. And so that's typically what we try to attack here--the hard things.
09:16 - 09:48
Ivy Ho: And we typically also combine multiple Createathon projects of similar themes into an R&D project because it is exploring a very generic capability. So, the Createathon teams might have one specific use case for a specific business unit. But R&D projects typically support multiple business units. It is more the generic capabilities that we need to build out to support many business use cases across FINRA. So that's why there are typically multiple Createathon projects in a single R&D project.
09:49 - 10:52
Greg Wolff: So, one, as I mentioned before, to give you the spectrum of ideas, I mentioned the language model we're using to look at the Market Reg data as it comes in the door. That's very specific. On the other end of the spectrum, there's the project we call Decipher. Well, it had a bunch of natural language processing to find names in text. It had research there. Then research in how do we actually display a graph and interact with a graph for the user. Then research into actual graph database and processing of a graph. There were like five things in that project that all came together to make that project as successful as it was. But, of course, that means you're doing so many things. So, it's a spectrum. That one had the most things in it. Others, like the language processing of input market data, which might seem counterintuitive because it's just market events, but still that one thing has turned out to be really important for data quality.
10:53 - 10:57
Kaitlyn Kiernan: And how many ideas are generally in the pipeline at any given time?
10:57 - 11:43
Ivy Ho: Since the launch of the program in the beginning of 2019, we have approved and funded 22 projects. Twelve of them have completed and 10 out of the 22 are still in progress. We have a big backlog and we know we can't do them all. Although we want to, but we can't. So, we have to prioritize in terms of how the proposals align with our strategy. We have at least a dozen or more in our pipeline at any single point in time and our Createathon is coming up in a couple of months and that will expand our backlog even further, so we don't have a lack of ideas in our backlog. The problem is--not the problem, it's a good problem to have--is to prioritize them so that we would focus on the important ones based on our use cases.
11:43 - 11:52
Kaitlyn Kiernan: Definitely a good problem to have. And so, Ivy, I want to talk a little bit about the process of what happens when an idea is first submitted.
11:52 - 12:00
Ivy Ho: As Greg said, the R&D program has open submission. Anyone from FINRA can submit an idea and all ideas are welcome and that's why we have a big backlog.
12:01 - 12:02
Greg Wolff: And please do, please do!
12:03 - 13:08
Ivy Ho: Yes. And submitted ideas are guided through the process with attention from one or two coaches to identify one, the business sponsor, and also to turn the idea into an R&D project proposal. And we do have a proposal template that essentially guides the submitter and the coaches through articulating the business value of the idea, the transformative nature of the idea and how it applies to the FINRA use cases, and to clearly define the scope and the objective for the project, because it is a project.
The coaches then assess the idea to ensure that the stated objectives are achievable. And together a team of coaches also performs a quick, what we call, market analysis to assess the idea versus existing capabilities or available tools in the marketplace. So, if there are tools out there that can answer or solve the problem, we certainly want to explore that. And when the proposal is ready then we present it to the leadership team, and they will rate the idea. Ideas scored high by the leadership team will proceed as approved R&D projects.
13:08 - 13:10
Kaitlyn Kiernan: And who are these coaches?
13:10 - 13:17
Ivy Ho: So, they are technologists and data scientists combined. And we have 10 coaches in the pool and Greg and I.
13:18 - 13:21
Greg Wolff: We're pulling them from all of our line of business support staff.
13:22 - 13:28
Kaitlyn Kiernan: So, it's an added part of their job. It's not their job. It's just an added thing that they're taking on?
13:28 - 13:29
Greg Wolff: That's right.
13:29 - 14:09
Ivy Ho: Yeah. R&D, the program does not have dedicated staff by design. So, with the exception of a couple data scientists and senior architects, we staff the projects by pulling people from business units and existing technology delivery teams. And it's because we wanted to expose people to this level of advanced technology and we believe that in addition to the technology and business processes, we really need to evolve the skill of our staff. We want to evolve towards an analytics workforce and the way to do that is by people participating in the Createathon, in R&D projects to give them hands-on experience working with the set of technology.
14:10 - 14:18
Kaitlyn Kiernan: And so, a project goes to that executive leadership team, it’s rated highly and is approved. What happens next?
14:18 - 14:43
Ivy Ho: We form a team. There's always a business sponsor. Most of the time we have more than one business sponsor per project. We have typically one or two coaches and then the business subject matter experts. And then three to four department staff. It is a small team. They work together anywhere from three to six months. And again, R&D projects are meant to be fast and cheap experiments, so we don't want projects to go on longer than six months.
14:44 - 14:53
Greg Wolff: Generally, all our projects have run three months with two of them running longer because they had value to run longer. So, we let them run longer.
14:53 - 14:59
Kaitlyn Kiernan: Three months is not that much time at all. So, do they need to arrive at the fully finished product at the end of three months?
15:00 - 16:13
Greg Wolff: Oh no, no. This is not about delivering production code. This is about taking risk, about going after something that we don't know how to do. We don't even know whether it's doable. For example, the language research project that I talked about, we did not know whether that was doable. It turned out to be doable. We've had a couple of projects where the end result has been we learned that that particular thing is just not possible with the current tech available, which is good to know because that means we actually prevented a real-life business project from getting out over their skis and failing because the tech just wasn't good enough. We did that work in the R&D program. So, we've had projects that deliver success with, "Wow, that's going to work, and this is how it's going to work," and we handed it off to another team. And with the one project I mentioned, the answer was that it's just not possible: don't try what we just did because it won't work. Let's change the scope of whatever that business project was that was hopefully going to happen, but now has to happen differently because we know that tech wasn't ready. Both of those are valuable to the company.
16:13 - 17:07
Ivy Ho: So, the deliverables from R&D are proof-of-concept prototypes. It's not software products. It's reference implementations to include some software code, because we do write code during those projects. And we also include data that's used to train, for example, a machine learning model, and a very detailed white paper. The white paper is designed to capture the knowledge gained from doing the R&D project. And the outcome from all of this, the white paper and the prototypes, provides a great resource for the software development teams to confidently plan and adopt the new techniques or new technologies, because it's been proven during the R&D process. It also gives them a much higher level of certainty of success as they plan and execute their roadmap. So, yes, R&D is not software product development, but it is really a great resource for the software development teams.
17:07 - 17:20
Kaitlyn Kiernan: It sounds like all of this is really allowing FINRA and FINRA Technology to embrace failure and allow that to happen in order to take the risks necessary to come to some really cool solutions over time.
17:20 - 17:23
Greg Wolff: Right. The saying out in the industry is fail fast.
17:23 - 17:24
Ivy Ho: That's right.
17:24 - 17:26
Greg Wolff: Or as I like to say do the hardest thing first.
17:26 - 17:27
Ivy Ho: Right.
17:27 - 17:29
Greg Wolff: And then figure out whether you can.
17:29 - 17:55
Ivy Ho: Yeah, the space of analytics tools is very much evolving. So therefore, our goal is really to evaluate the applicability of the latest research breakthroughs. This is really leading-edge technology in a very rapidly evolving ecosystem. And also, to test the feasibility of FINRA's use cases against the tools, so that we can make investments, or not, in adopting any of the new tools.
17:56 - 18:04
Kaitlyn Kiernan: So, Ivy, you mentioned there have been 22 projects and 12 have been completed. What's the status of the completed projects?
18:04 - 18:10
Ivy Ho: Some of the completed projects have moved on to implementation in real production applications.
18:10 - 18:20
Kaitlyn Kiernan: Greg, you mentioned all this work with A.I., big data and natural language processing. These are concepts that are hard for me to understand. Why is FINRA pursuing this technology?
18:20 - 21:45
Greg Wolff: So A.I. is an acronym that out in the industry means artificial intelligence. But I like to change that acronym here for FINRA. We're not looking here at FINRA for artificial intelligence. We're looking for something that I like to call assistive intelligence. That is, figure out how to build a data-driven model, which is what machine learning is. So, to digress: you take a bunch of data and you use that data to build a machine learning model. Not that the machines learn anything, but you, quote, "learn" from the past history, and you build a function that says, "Well, I've seen all of this data, and if I get this next data..." Let's say, for example, a complaint. We get a nice detailed complaint from somebody. And imagine a model that has looked at the hundred and fifty thousand complaints and cases that we've examined and says, "Aha! This complaint is very likely to be real. Examiners, look at this one."
That means all the noise that the examiners have to paw through, maybe they can focus right in on the important stuff. And that would save them time and effort. And then, imagine a model that found the key names of people, places and things and just pulled them right out of the text. So now you're in a cycle exam and you're having to spend hours reading through documents the firm has given you. Well, maybe you hand the documents to this model, this machine learning model, and it finds all the names. And not only that, that would allow us to actually match up the names to everything in CRD and elsewhere in our databases.
So, when we talk about advanced analytics it's really two things. One, help the end user do data discovery to analyze the data, to see patterns in the data, and, two, to actually apply a pre-built model to do things that we're confident the machine can do, because the machines are not intelligent. They're really stupid. But we can build software that will make it easier for the user, for example, the examiner I'm thinking about, to actually do the work, to get rid of the drudgery, to find all the names rather than, you know, yellow-highlighting printed documents, which used to be what people had to do. I could imagine a case where we open the document with all the highlights already in it.
That's what you can do with this A.I. Don't think of it as artificial taking people's jobs. For us it's not about that. It's about assisting the people, making everybody more efficient. Making it easier to do your job, to get to the point of the matter, because an examiner will spend a lot of time pawing around through stuff at the firm. If we can give them tools that take hours of drudgery out of it and get right to the heart of the matter, that's incredible. That would really help. So you look at your historical data, you figure out the patterns of the data, all sorts of math and tech is involved there, and then you build a function that helps say, "Oh I'm looking at a new piece of data, I see these things in it that are important to you, because of what we know from history." Oversimplified, but that's what it is. Data driven functions.
21:46 - 21:57
Kaitlyn Kiernan: It makes sense. I think it's Moore's Law, where the amount of data out there is doubling every two years, and more data isn't better if you can't process it and if it just muddies the picture.
21:57 - 21:58
Greg Wolff: Exactly.
21:59 - 22:11
Kaitlyn Kiernan: Greg, you talked about a couple examples of stuff that's currently in the pipeline or that has gone through the pipeline that's going to be implemented. Are there any other examples of ideas you're working on?
22:11 - 23:23
Greg Wolff: The language model I mentioned. Nobody works with data, Market Reg data, at the scale we work with. Nobody else had ever tried to apply these language models. That one was one of the few that we've done that was real industry-leading research. On the other hand, we still do some other mundane things that are research for us. So, what should our model development lifecycle be? This is brand new to us. We don't know how to do this. And this is so new, as Ivy was mentioning, this stuff is so new in the industry that there isn't really a well-defined model development lifecycle we can just pick up. For our standard software development lifecycle, we've got that waxed. That's based on Agile practices that were developed in the early 2000s. There's nothing like that for managing these models in the data science. So, we expect to get funded a project that's just going to explore and figure out how do we do model development here at FINRA, because we don't know how yet. This is new for us. So, yes there's some real science-y math stuff and then there's the more mundane, just figure out how to do something. We do both.
23:24 - 23:32
Kaitlyn Kiernan: It can't all be the sexy projects all the time. So just to wrap up, what does success look like for the R&D program?
23:33 - 23:49
Greg Wolff: Innovative ideas get time in the sun to see whether or not we can actually do them. And getting that time in the sun and getting effort on the idea, that alone is success. Then of course, there's we want something hardcore and tangible out of it as well, but...
23:49 - 24:45
Ivy Ho: I'll add to that by saying that success to me is when a group of people come together to evolve an idea from a not-yet-baked idea into proven concepts and eventually software products. It's collaboration and it's innovation, which, as you mentioned, Kaitlyn, are two of FINRA's values. We've seen tremendous enthusiasm from our business users, because they can envision a future where the advanced analytics technology can automate the tedious, the repetitive tasks, so that they can be more focused on the deeper issues. They don't want to be reading hundreds of pages of documents. They want documents summarized for them so they can do deeper investigations, ask deeper questions, dig deeper into more complex issues. So, it really is freeing up our knowledge workers' time so that they can be more focused on more complex issues.
24:46 - 25:03
Kaitlyn Kiernan: And in further recognition of the importance of data analytics and technology in FINRA's work today, on our next episode we're going to talk to Kerry Gendron, who is Member Supervision's data analytics head. So, they have someone on their team outside of Technology focusing on this area.
25:04 - 25:08
Ivy Ho: Yeah, and we partner with that group very closely on many of our projects.
25:09 - 25:12
Greg Wolff: Kerry is the sponsor of one of the projects that's finishing up right now.
25:12 - 25:18
Ivy Ho: Yeah, I think she might have sponsored so far four or five projects already, and she is relatively new at FINRA.
25:18 - 25:48
Kaitlyn Kiernan: Yeah, she is still new, so that's why we want to get to know her. Well, Ivy and Greg, thanks so much for joining us to tell us about FINRA's R&D program. I'm fascinated and looking forward to hearing about some of the projects that make their way and are ready for primetime for FINRA. But for listeners, if you don't already, subscribe to FINRA Unscripted on Apple Podcasts, Spotify or wherever you listen to podcasts. If you have any ideas for future episodes you can email us at [email protected]. Until next time.
25:54 - 26:21
Disclaimer: Please note FINRA podcasts are the sole property of FINRA and the information provided is for informational and educational purposes only. The content of the podcast does not constitute any rule amendment or interpretation to such rules. Compliance with any recommended conduct presented does not mean that a firm or person has complied with the full extent of their obligations under FINRA rules, the rules of any other SRO or securities laws. This podcast is provided as is. FINRA and its affiliates are not responsible for any human or mechanical errors or omissions. Parties may not reproduce these podcasts in any form without the express written consent of FINRA.
26:21 - 26:27
Music Fades Out