89 minutes
SDS 705: Feeding the World with ML-Powered Precision Agriculture
Welcome to a thought-provoking exploration of the intersection of agriculture and advanced technology. Join Feroz, Jeremy, and Thomas as they unveil the transformative impact of data science on traditional farming practices. This episode promises insights that redefine the future of agriculture. Let's embark on this enlightening journey.
About Feroz Sheikh
Feroz is an accomplished CTO, thought leader, and entrepreneur. He has founded and led multiple technology-driven ventures, raised VC funding, and taken them to successful M&A. He has worked philanthropically, making education accessible to 200 million children in India using digital technologies. He is currently focusing on driving digital transformation across agriculture globally as the CIO and CDO at Syngenta Group, and the Board Chair of AgGateway Consortium. Feroz is a thought leader, shaping industry standards, creating open-source software and ecosystems. Apart from computer science and data science, his interests include physics and astronomy. He built a superconductor when he was 16, built an 8-inch reflector telescope with a team of students, built a robotic arm when he was in high school, and was felicitated by the then Prime Minister of India, Mr. P. V. Narasimha Rao.
About Thomas Jung
Thomas is a visionary digital agriculture leader, pushing the boundaries of plant protection as Syngenta’s Head of IT for R&D. Thomas's passion for technology started when he got his hands on a Cray X-MP 22 supercomputer at the tender age of six. Since then, he's been hooked on tech of all sizes, from fully digitized tractors to in-silico simulation of molecules. With these tools, he is driven to combat the climate crisis by regenerative agriculture. With a strong track record in merger integration and driving transformative change, Thomas knows how to make a real impact in the industry.
About Jeremy Groeteke
Jeremy is an entrepreneurial operational manager with expertise in building and launching new platforms across diverse agricultural segments. He has leadership experience within new business development, territory management, branding strategies, product development, program management, budgeting and industry networking initiatives. He leverages customer centric design to drive strategic planning to develop products, teams, marketing strategies, and tactical implementation of key initiatives. Results achieved through professional team management, diversity, authentic communication, and integrity.
Overview
In a captivating episode that's both inspiring and educational, three pioneers from Syngenta shed light on the transformative power of data science and A.I. in agriculture. Feroz, Jeremy, and Thomas dove deep into the pressing challenges of food security and climate change, underscoring how data-driven techniques can offer groundbreaking solutions. They introduced listeners to the world of computational agronomy, a field that doesn't just amplify crop yields but refines and automates age-old agronomic practices with machine learning, ensuring that every plant thrives.
Generative chemistry took center stage as a groundbreaking avenue to accelerate the discovery of new agricultural compounds. Unlike conventional methods, this innovative approach can conceive compounds beyond human imagination, offering solutions that were previously deemed impossible. As the discussion progressed, the wonders of smart growth chambers came into focus. These aren't just futuristic greenhouses; they represent a vision where machine learning meticulously monitors and assists every plant throughout its lifecycle, from a tiny seed to a bountiful harvest.
The nuances of precision agriculture were also explored, portraying a future where farming evolves into a precise art. Imagine a farm that addresses every plant's unique needs, considering its spatial, temporal, and genetic aspects. It's not science fiction; it's the imminent future of agriculture, as depicted by the experts.
But what truly sets this episode apart is its empowering message for listeners. It isn't just a passive learning experience; it is a call to action. For data enthusiasts and professionals alike, there's an unprecedented opportunity on the horizon: to harness these innovations and make a tangible difference in the world. Whether it's using drones for seeding or crafting novel solutions, the future of farming beckons, and with it, the chance to feed the world through the lens of data science.
In this episode you will learn:
- What is precision agriculture? [09:43]
- What is computational agronomy? [12:30]
- How Syngenta helps growers optimize yields [21:37]
- How to bridge the gap between R&D and out in the real world [33:58]
- What is generative chemistry? [37:52]
- How generative chemistry accelerates the discovery of new compounds [41:55]
- How you could make a big social impact in agriculture with data science [56:22]
- How to go about designing ML models for agriculture [1:00:27]
Items mentioned in this podcast:
- AWS Trainium
- AWS Inferentia
- Modelbit
- Syngenta
- Syngenta Crop Protection and Insilico Medicine Harness
- ACS Med Chem Lett. 2020 Article
- ACS Med Chem Lett. 2023 Article
- Retrosynthesis for Chemistry
- Shoots by Syngenta
- Syngenta Group Ventures
- G Robotic Systems
- Conviron
- Innovation in Agriculture: Biologicals by Syngenta
- Soil Health by Syngenta
- Hamburgers in Paradise by Louise O. Fresco
- Drawing Data with Kids by Gulrez Khan
- The Infinite Game by Simon Sinek
Podcast Transcript
Jon Krohn: 00:00:00
This is episode number 705 with Feroz Sheikh, Jeremy Groeteke and Thomas Jung of Syngenta. Today's episode is brought to you by AWS Cloud Computing Services and by Modelbit for deploying models in seconds.
00:00:18
Welcome to the SuperDataScience podcast, the most listened-to podcast in the data science industry. Each week, we bring you inspiring people and ideas to help you build a successful career in data science. I'm your host, Jon Krohn. Thanks for joining me today. And now let's make the complex simple.
00:00:49
Welcome back to the SuperDataScience podcast. Today we've got another special one for you. This time, it's the first-ever episode with not one, not two, but three guests. All three of our guests hail from Syngenta, one of the world's largest agricultural companies. Based out of Switzerland, Syngenta Group has over 50,000 employees and whopping revenue of over $30 billion annually. We've got them on the show today because Syngenta are AI pioneers who are leveraging machine learning to enable farmers to nourish us with ever-increasing efficiency and simultaneously an ever-smaller climate footprint. Super, super cool. All three guests are senior leaders at Syngenta, and each is at the vanguard of this socially impactful AI transformation.
00:01:29
Feroz Sheikh is Chief Information Officer and Chief Digital Officer at Syngenta Group. Until last year, he was their global head of engineering and data science. Separately from Syngenta, Feroz has made education accessible to 200 million children through digital tech. Our second guest is Jeremy Groeteke. Jeremy is Head of Computational Agronomy, wherein he leads the Syngenta Group's digital agronomy and data science teams to, for example, define farm-based experimental designs and create planting, spraying, and fertilizer prescriptions to meet the needs of agronomists and farmers. And our third guest is Thomas Jung, who is Syngenta's Head of IT for R&D. He's responsible for the digitization of science to protect our plants and planet by leading Syngenta's DevOps, including for in silico biology and chemistry, as well as digital trialing and product development.
00:02:21
Today's episode is mostly high-level, so it will be inspiring and educational to hands-on data science practitioners and non-practitioners alike. In the episode, Feroz, Jeremy, and Thomas detail how data science and AI can help us tackle the world's food security and climate change challenges, what computational agronomy is and how it increases crop yields, how generative chemistry accelerates the discovery of useful new agricultural compounds, and how smart growth chambers illustrate today that, on the farms of the future, ML will precisely monitor and assist every plant at every moment from seed to harvest. Finally, we've got ideas for how you yourself can apply ML to help feed the world. All right, you ready for this delicious episode? Let's go.
00:03:08
Got the Syngenta crew here for a SuperDataScience podcast. Thank you all for coming in and joining me from all over the world. Let's go sequentially by the way that the Sun is waking us all up over the course of the day. So, Feroz, let's start with you. Where in the world are you calling in from?
Feroz Sheikh: 00:03:29
Hi, Jon. And thanks for having us. I'm usually based in Switzerland, but today I'm calling in from Pune in India.
Jon Krohn: 00:03:39
Nice. And then in Switzerland, I think we have Thomas Jung, is that correct?
Thomas Jung: 00:03:43
That's right, Jon, thanks for inviting me. I'm holding position for Feroz here in Basel. Beautiful Switzerland, calling from Europe.
Jon Krohn: 00:03:52
Nice. And for people watching the video version, it looks like Thomas is on this fake Zoom background, but he actually proved to me that this is painted on the wall. It's the recording studio for Syngenta. So, it's pretty amusing.
Thomas Jung: 00:04:05
It's real, no AI involved.
Jon Krohn: 00:04:11
And then our final guest for our record three guests on the SuperDataScience podcast. I don't think that's ever happened before. It certainly hasn't in the three years that I've been hosting. We've got Jeremy. Jeremy, where are you calling in from?
Jeremy Groeteke: 00:04:24
Hey, Jon. Here in Des Moines, Iowa. So, Central Corn Belt for agriculture. And looking forward to the conversation today.
Jon Krohn: 00:04:32
Agriculture certainly is the theme of the day. And yeah, I had the great pleasure of meeting Feroz and Thomas in Switzerland when I was there in May for the St. Gallen Symposium. We've actually had a few guests on the show that were people that I met at the symposium. So, folks may have noticed that name come up a few times in recent months. But there's also a special connection here because our researcher for the SuperDataScience podcast, Serg Masís, also works at Syngenta. And so he knows all three of our guests today. For most episodes, Serg is digging up details on our guests, and basically all the best questions that I ask are Serg's idea. And in this case, he was also able to work with his colleagues on putting some questions together. So, there's also that special connection there. So, in terms of the research that Serg has prepared: Feroz, you are the Chief Information Officer, as well as the Chief Digital Officer, of an agriculture company, Syngenta, with a mission to improve food security and sustainability. Now, agriculture is not necessarily an industry that you would think of as data intensive. So, could you shed some light on the role that data science and machine learning play at Syngenta?
Feroz Sheikh: 00:05:56
Yeah. You can see agriculture as a jigsaw puzzle. That's my favorite analogy. As we start looking at creating food security for the growing population, the yield has to continue to increase. And over the last 40, 50 years we've seen that advancements in genetics, advancements in chemistry, and advancements in machinery on the farm have made a significant impact. And the next bit of improvement in agriculture would come through applying data and data science. So, it's like the missing piece of the jigsaw puzzle. And when this piece falls in place, we will see a tremendous amount of transformation for agriculture.
Jon Krohn: 00:06:53
I like that analogy, Feroz, this idea of a jigsaw puzzle and data being the missing piece in there. So, Syngenta in particular is focused on tackling big challenges that our society faces, like food security and climate change. Could you speak a bit to how data science and machine learning are specifically used to tackle those?
Feroz Sheikh: 00:07:14
When you think about the world today, the population continues to grow. By 2050, we will be 10 billion people on the planet. But there is no more agricultural land. There are several research studies and surveys that indicate that globally agriculture has reached the peak land available. Any more land would come at the cost of deforestation and things like that, which we as a society don't want. Similarly, climate change continues to affect us. And if we have to limit global warming to the 1.5 degrees Celsius target, we need to reduce carbon emissions by 22 gigatons. And nearly 25% of those emissions come from food and agriculture-related activities. So, when you talk about big problems, these are really planet-scale problems we are talking about here.
Jon Krohn: 00:08:21
Wow. Yeah, I wasn't aware of that, that one in four units of carbon dioxide emissions come from agriculture. And that's not something that we can, you know, just get rid of. We need to keep eating. So, yeah, this is essential in order for us to be able to all live together. And we're still anticipating for a few decades more that a couple billion more people will come on the planet. So, this efficiency is essential to being able to feed all of us. And then I think food security is critical to, in general, avoiding other conflicts on the planet. I think it's clear that when there are heat waves and farming is disrupted in regions, that is likely to precipitate armed conflict. And so yeah, in so many ways, this ability to feed everyone on the planet without further increasing our carbon footprint, without further deforesting the planet: lofty goals, big challenges. And it's great that Syngenta are up to it, and that data science and machine learning are playing a role. So, a specific question related to this: something that I've read about and heard about, but don't personally know much about, is precision agriculture. So, how can we use precision agriculture to boost productivity while mitigating risks?
Feroz Sheikh: 00:10:01
Historically, agriculture has been a practice of dealing with averages. So, you have a large field out there, you plant an average rate of seed, the plants grow, you spray broadcast herbicide, and so on, right? So, it's been just dealing with averages. But it's natural to understand that not every part of the field is similar. Soil quality changes within the field. The productivity and the nutrition available to the plants changes, or the risk of, let's say, weeds emerging in different parts of the field is different. And that's where precision agriculture is a shift from those averages, bringing it down to dealing with different parts of the field with different agronomic protocols and interventions. I like to see it as going down from these averages to caring for every single plant. So you get spatial precision. Providing that care to the plant when the plant needs it, that is temporal precision. And then down to genetic precision, where we are looking at exactly what the weed species or the insect species is, and what the best-recommended intervention for that might be. So, that's what it is. I think the listeners would appreciate going from averages down to dealing with every specimen through spatial, temporal, and species-level precision.
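To make the shift from field averages to spatial precision concrete, here is a minimal Python sketch. It compares broadcast spraying of a whole field with treating only the zones where weed risk is high. The `Zone` class, the zone names, the risk scores, and the 0.5 threshold are all invented for illustration; nothing here reflects Syngenta's actual models or data.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    """One management zone within a field (hypothetical structure)."""
    name: str
    area_ha: float
    weed_risk: float  # invented score between 0 (low) and 1 (high)


def broadcast_area(zones):
    """Area treated when the whole field is sprayed at one average rate."""
    return sum(z.area_ha for z in zones)


def targeted_area(zones, threshold=0.5):
    """Area treated when only zones at or above the risk threshold are sprayed."""
    return sum(z.area_ha for z in zones if z.weed_risk >= threshold)


# Invented example field split into three zones.
zones = [
    Zone("north", 10.0, 0.8),
    Zone("center", 25.0, 0.2),
    Zone("south", 15.0, 0.6),
]

print(f"Broadcast: {broadcast_area(zones):.1f} ha treated")
print(f"Targeted:  {targeted_area(zones):.1f} ha treated")
```

Temporal and species-level precision would add further dimensions (when to treat, and with what), which this sketch deliberately leaves out.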
Jon Krohn: 00:11:50
Very cool. Spatial, temporal, and species level. I love that idea. So, I guess there's eventually a goal to be able to use data collection to really be able to track. I imagine that's not something that's pervasive globally right now, to be able to track every individual plant at every time point and know exactly what's right for them. But maybe that's the kind of thing that you guys are starting to work on. And so I guess this would eventually allow us to have what we call a digital twin of a real-life farm.
Feroz Sheikh: 00:12:25
Absolutely.
Jon Krohn: 00:12:27
All right, so speaking of farms, Jeremy, you grew up on one and you have over two decades of experience in agronomy, and you're now heading computational agronomy at Syngenta. So, what does agronomy mean and what makes it computational?
Jeremy Groeteke: 00:12:42
Jon, great to be here. So, yeah, I grew up on a family farm, and as Feroz was talking there on precision ag, I got to thinking back. The first time we implemented precision farming on our actual farm was in 1995. We implemented a technology called yield monitors. Very simply, it's a mechanistic pressure plate. As the grain would go by, it would read different pressures. And those pressures, through data science (I mean, this is old-school data science, right?), were converted and calibrated to give growers bushels per acre. And that was really the first time that growers were able to move away from what Feroz mentioned, field averages, because up to that point, almost all growers would say, "This farm yielded 180 bushels." But within that, every grower knew that it ranged from 120 to 220 bushels. There was variation that happened, but they would farm on the averages.
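The calibration step Jeremy describes can be sketched roughly as follows. This assumes a simple straight-line relationship between the sensor reading and yield, which is an illustrative simplification; real yield monitors typically also account for factors such as ground speed, header width, and grain moisture. All readings and yields below are invented, chosen only so that the result echoes the 180-bushel average and 120 to 220 range from the conversation.

```python
def fit_linear_calibration(readings, yields):
    """Least-squares fit of: yield = slope * reading + intercept."""
    n = len(readings)
    mean_x = sum(readings) / n
    mean_y = sum(yields) / n
    sxx = sum((x - mean_x) ** 2 for x in readings)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(readings, yields))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical calibration loads: sensor reading vs. weighed yield (bu/ac).
calib_readings = [2.0, 3.0, 4.0, 5.0]
calib_yields = [100.0, 140.0, 180.0, 220.0]
slope, intercept = fit_linear_calibration(calib_readings, calib_yields)

# Hypothetical readings logged as the combine crosses the field.
field_readings = [2.5, 3.5, 4.0, 4.5, 5.0, 4.5]
yield_map = [slope * r + intercept for r in field_readings]

field_average = sum(yield_map) / len(yield_map)
print(f"Field average: {field_average:.0f} bu/ac")
print(f"Within-field range: {min(yield_map):.0f} to {max(yield_map):.0f} bu/ac")
```

The single average hides the spread that the per-location values reveal, which is the gap between farming on averages and precision farming.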
00:13:44
And so, what is computational agronomy and what is agronomy? Agronomy is the science of growing plants, at the end of the day. It combines soil science, environmental sciences, as well as plant sciences. And agronomy really is the nexus to bring all of these disciplines together. In our own organization, we'll have plant breeders, we'll have soil scientists, we'll have climatologists, but agronomy is this nexus to bring all of those components together. So, it's no different than when you water your grass out in your yard, right? Agronomy is used in turf management, and in vines and grapes. It's to bring all those together to maximize production, ultimately with as minimal inputs as you can.
00:14:30
And so what is computational agronomy? Well, this is the codification of those processes and principles, or the rules of crop science, as we would call it. And so what does it mean for water to move from the ground into a plant and out the stomata, right? Transpiration. How does that work? And so we really want to get into the science of that piece and codify it so that we can build these digital twins. We have to understand these principles so that we can replicate them in a computer, in an environmental world. And so I'm sure we'll get into it, but we use many principles and many different components, from mechanistic models to machine learning models to empirical models, all types of different models to represent and reproduce what I used to do as an agronomist, which was very manual, right?
00:15:25
As I went to school, I learned all these sciences and put them into my head through memorization, study, and education. And then I would walk into corn fields and soybean fields and diagnose what's going on. Is there a disease? Is there an insect? Is the plant stand correct? All of these things, visually and cognitively. And now what we're doing is taking those same components and converting them to machine learning and technologies. And frankly, the advent of machine learning and computer vision has just accelerated this. So, it's old hat. When I went to school, it was just called statistics. Now it's way cooler and called data science. But, in the new technologies that come.
Jon Krohn: 00:16:11
Are you stuck between optimizing latency and lowering your inference costs as you build your generative AI applications? Find out why more ML developers are moving toward AWS Trainium and Inferentia to build and serve their Large Language Models. You can save up to 50% on training costs with AWS Trainium chips and up to 40% on inference costs with AWS Inferentia chips. Trainium and Inferentia will help you achieve higher performance, lower costs, and be more sustainable. Check out the links in the show notes to learn more. All right, now back to our show.
00:16:49
Yeah, that is exactly right. In this field, probably all four of us have been doing it for a while under the guise of statistics. But now, yeah, we have much more computationally intensive algorithms. And so I think that's part of why we end up calling it data science, this big umbrella term that captures not only statistics, but also machine learning approaches that aren't statistics. But very cool to hear how computational agronomy is taking all of the things that you learned about optimizing yield and soil management, and piece by piece, I imagine you're trying to say, okay, where's the next piece of low-hanging fruit here, figuratively speaking? What's the next thing that we can automate?
00:17:40
How can we get a machine learning algorithm to learn this task? A machine vision algorithm to be able to capture the knowledge that you learned in school, Jeremy, and be able to ensure plants are healthy, they're standing tall, and bearing lots of literal fruit. So, yeah. So, let's dig into that a little bit more. So, how are things like satellite imagery and remote sensing used to monitor fields already today and enhance operational efficiency? And then what do you kind of see happening as next steps? What are the next bits of, again, figurative low-hanging fruit?
Jeremy Groeteke: 00:18:16
Yeah, just a little bit of the history on that as we move into some of these topics. In ag, we've been trying to solve a lot of these components and problems as a point source, right? So, trying to figure out how to do nitrogen management, or stand counts, or disease management. We've been really doing these as point-source components when the reality is they're all highly interdependent. It's a very complex ecosystem. Mother Nature's not easy, right? And so as we've gone forward, we have these multiple scales that we have to operate at. We have to operate and understand what's going on at the plant level. And so that's where we look at technologies like phones with very close-up cameras and resolution, so that we can take a picture of a disease and use computer vision technologies to identify it and train those models so that the human force can be scaled 10x.
00:19:13
And so they're able to very quickly go from what is on my plant to how to treat and recommend for that plant. But that's at a leaf level. And then you're gonna move backwards, up in scale. Around the world, depending on what country you're in, farm sizes have continued to get bigger. Whether you're in a country that starts with one hectare and they went to five hectares, they're still getting bigger in scale. Or if you're talking about the US, Europe, Latin America, where they're in the 10, 20, and 30 thousands of hectares that growers manage. And so the complexity of management and scale of management has continued to grow. And so that's where we start looking at other technologies like remote sensing to be able to help growers monitor change management. Right? Agriculture's always going through constant change because Mother Nature's not consistent. Weather and climate make a big impact on the plan that growers have for the crop growth and development. We may get drought, you can see the heat waves around the world, or floods or rains, and that changes how growers have to manage. And so this is where we've really got to be dynamic in the data science to incorporate those environmental classifications to really understand the impacts. The law of the average in this world is definitely not going to work. We've really got to be dynamic in understanding when we see extremes, forecasting them, and what the impact of that is gonna be on the growth of those crops around the world. So, it really comes down to the scale that we're trying to manage, Jon. If we're trying to look at a plant level, we're gonna use very specific, as Feroz mentioned, species-level genetic information combined with data science on how to manage that, right? Is that specific genetics tolerant to a disease? Is it susceptible?
00:21:07
So, when I take that photo and I know that it's present, what is the response we expect from that plant? Will it be able to hold off that disease or will it be susceptible? And then combine that all the way through to the recommendation. At the end of the day, the grower wants to know what action they need to take. And that's really what we try to do. We simplify and do all the complex data science internally, but really we just want to give the grower a recommendation of what they need to do as a next step.
Jon Krohn: 00:21:36
Yeah. Let's dig into that a little, in a little bit more detail so that we can kind of understand not just at a high level, but maybe kind of like with a case study or two, when an individual farmer, wherever they are in the world, when they're working with Syngenta and they want to be optimizing their yields, I mean, that seems like an obvious thing to be doing. They want to be making the best use of the current soil conditions that they have of, you know, the anticipated climate conditions over the coming years. So, when they're, you know, planning for the harvest, I mean, yeah, if you could dig into like a specific use case where like, you know, if I'm a farmer, how do I interact with Syngenta? And then how does Syngenta provide recommendations for how the farmer can be optimizing their yields?
Jeremy Groeteke: 00:22:24
Yeah, no, a great near-term use case is one that just happened this growing season here in the southern hemisphere, in the Argentina market. Many of you may have heard of the El Niño and La Niña effects, right, that go on from the climate. That [inaudible 00:22:44] can have significant impact on what the forecasted weather can be. And so we have internally developed a kind of yield prediction model. What we can do is, when we bring in soils and climate information, we can help growers understand what their yield potential is going to be in the upcoming season. And so we were able to, this last season, with growers knowing that the long-term forecast of El Niño was coming in, in La Niña, what the difference is, we incorporate that into the model. We're then able to place product recommendations and density recommendations for those growers. Because the forecast was for a dry season coming up before they even started planting. And so we were able to communicate that to growers and help them make better seed selections, as well as management selections, going into a conservative approach for the drought that was potentially going to be coming on. And in hindsight that played out because, again, the El Niño, La Niña effect is very strong in that part of the world.
00:23:52
And so again, it's a way of bringing in multiple streams of data, combining them into a digital twin, understanding what that impact is going to be, and then converting that to product recommendations of the right hybrids and right seeding rates moving into a dry climatic region where their yield potential was going to be reduced. And so that's really how we get way more efficient with growers, helping them stay profitable and produce food without wasting resources. As an example there, Jon.
Jon Krohn: 00:24:27
Very cool example. Yeah, I read just yesterday that South America and Canada had bumper years for wheat, I think it was. And so helping keep those costs down globally this year. So, then specifically in terms of interaction, the mechanics of that, Jeremy: does the farmer log into an app on their phone, or do they sit at their computer? How does it work when you convey this information to them? And how do they get looped in, in general? Is it that they're looking for someone to buy seeds from and they're like, oh, if I work with Syngenta, then I can get all this useful predictive information that is likely to make sure that I get a better bang for my buck?
Jeremy Groeteke: 00:25:18
Yep. So, with growers, much like the general population, there's a bell curve of adoption, right? You have the bleeding edge, the leading edge, and the majority in the middle. And so we do have some growers that adopt the technology by themselves and log in and use our systems directly. But the reality is the majority of farmers that we deal with around the world work with our field staff and our channel partners, who really deliver this information to growers. Because, again, a grower by himself or herself is, you know, a banker, an HR manager, an agronomist. They have to do all of these tasks because they're running their own business at the end of the day. And so they really rely on the support of a lot of agronomists and advisor networks around the world.
00:26:09
And so we deliver a lot of our solutions and technologies through either direct routes to market or channel partners to the majority of farmers. But yes, we do have farmers that work directly with our solutions as well. And I'd say this has probably been the thing where I'll group ag tech together and maybe throw us all a little under the bus here. For the last 10 years, we've really not made things completely simple. And so we've really got to make a massive improvement. I'll be honest, I think all of ag tech really underestimated user experience in developing solutions in the first half of this journey. And that has significantly changed in the latter half here, in the last three, four years. Teams now have user design and user experience people to help make this much simpler. We can't just put a big complex model out there and expect them to understand it and use it. They don't care, frankly. And we've got to simplify the experience. I think that's gonna be the big piece for us. And I see large language models, frankly, helping a lot with that as we move forward.
Jon Krohn: 00:27:20
That was gonna be my next question. I was like, ah, this seems like such an obvious application. You could have large language models being able to answer people's questions, in a kind of chatbot experience. They go to syngenta.com and just have this interaction, or on their web app, or it could even be by voice. They don't even necessarily need to be able to type. And in fact, there's probably a lot of farmers all over the world where even literacy is not guaranteed. And so you could even have it be voice, where they just speak to their phone in whatever language. And then the LLM can speak back and provide recommendations and a huge amount of information. There could be things, particularly in the developing world, where there's just an education component to it, where this LLM can explain with its voice: "Oh yeah, if you're interested in hearing more about next year's climate and the kinds of things that are going to affect it, just ask me some questions." And it can answer in real time. That's such an amazing opportunity there.
00:28:35
And so I think that could potentially allow you to tighten the loop a bit where, because my guess is when you're relying on channel partners or that kind of thing to convey information, you can't be sure of how the information is being delivered, and you don't get as tight feedback back. But if they're working with you directly, then there's this opportunity to be not only getting verbal feedback, kind of in real-time, but you could, this also is now one step closer to people being able to provide the kind of precision agriculture that Feroz was talking about where you know, specific where we have sensors, and like, and this is like now step, several jumps from just having a phone, but, you know, having sensors that are recognizing individual plant health at a specific time. And Syngenta being able to directly advise and maybe say with an LLM, "Hey, it looks like that plant has this particular condition. You're gonna need this kind of treatment for the plant. Maybe next season you could do this thing differently with the way you've set up your farm to avoid this kind of issue happening in the future." Really a huge amount of opportunity here, I imagine.
Jeremy Groeteke: 00:29:49
Yeah. Yeah. Actually, ironically, we just got done completing a hackathon here. And the code name was Jarvis. So, for the Marvels crew and Tony Stark fans out there, right, everybody knows Jarvis. That's literally the concept that we were putting on the table is, is how do these LLMs through voice speech or text-to-voice conversions help this communication and deployment piece. Language barrier is a huge piece around the world and simplifying this down, and we do see this as a potential as we move forward. I think we have to watch out. I mean, everybody's aware of this, the hallucination piece that comes from the large language models currently in the tech stacks that exist. And we see this really, again, I think I see this as an interface more than the model that is sitting behind them delivering the insights. We will leverage other models that actually deliver the insight or produce the insight. The LLMs will just relay it in a familiar tone and voice and communication style. But they're not, I don't think today going to be the ones that actually are the model delivering the insight. We will have other models behind the scenes that are put together in our orchestration hub example, to deliver. But the LLMs and text-to-voice will really be a user interface at the end of the day.
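The split Jeremy describes, where separate models produce the insight and the language model only relays it, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Syngenta's orchestration hub: the `Insight` fields, the `disease_risk_model` rule, and its thresholds are all invented for the example, and a plain string template stands in for where an LLM or text-to-voice layer would sit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insight:
    """Structured output of a domain model (all fields are illustrative)."""
    field_id: str
    disease: str
    risk: float           # probability in [0, 1] from the upstream model
    recommendation: str


def disease_risk_model(field_id: str, humidity: float, temp_c: float) -> Insight:
    """Stand-in for a real agronomic model; the rule below is made up."""
    if 15.0 <= temp_c <= 25.0:
        risk = min(1.0, max(0.0, (humidity - 0.5) * 2.0))
    else:
        risk = 0.1
    advice = "scout the field this week" if risk >= 0.5 else "no action needed"
    return Insight(field_id, "tar spot", round(risk, 2), advice)


def relay(insight: Insight) -> str:
    """Interface layer only.

    A production system would hand these fields to an LLM to phrase,
    translate, or speak. The key constraint is that this layer rewords
    what it is given and never computes the insight itself.
    """
    return (
        f"For field {insight.field_id}, the estimated {insight.disease} risk is "
        f"{insight.risk:.0%}. Suggested next step: {insight.recommendation}."
    )


if __name__ == "__main__":
    print(relay(disease_risk_model("north-40", humidity=0.85, temp_c=20.0)))
```

Keeping the numbers in a structured object and letting the language layer only reword them is one way to limit the hallucination risk mentioned above, since the interface has nothing to invent.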
Jon Krohn: 00:31:20
Deploying machine learning models into production doesn’t need to require hours of engineering effort or complex home-grown solutions. In fact, data scientists may now not need engineering help at all! With Modelbit, you deploy ML models into production with one line of code. Simply call modelbit.deploy() in your notebook and Modelbit will deploy your model, with all its dependencies, to production in as little as 10 seconds. Models can then be called as a REST endpoint in your product, or from your warehouse as a SQL function. Very cool. Try it for free today at modelbit.com, that’s M-O-D-E-L-B-I-T.com
00:31:58
Exactly how I was imagining it. Yeah, that makes perfect sense to me. All right. We've had lots of head nodding from Thomas and Feroz, Thomas, let's jump to you since we haven't heard too much from you yet. So, you are the Head of IT for R&D at Syngenta. So, how is the application of data and AI different for research and development than the kinds of things that we've been talking about so far with Jeremy and Feroz, largely where we were talking about data and AI applications with farmers directly? How is it different for R&D?
Thomas Jung: 00:32:35
Yeah, I feel blessed actually, in comparison to Jeremy and Jeremy's world, because the biggest difference is that we're able to control for almost every variable, right? Where out in this world, a plant is not just a plant, right? But there's so many variables from soil to weather and everything in between. And we can control all of this in R&D, obviously not the weather at a big scale but of course, locally in any experiment that we do, right? So, we have much better means of creating the data we want and understanding deeply how plants react to whatever triggers we expose them to. So, which is a beautiful thing for science. So, it helps us innovate, but actually it also creates a couple of challenges and limitations, because when we do this, how can we translate this into the real world back again, right? And that actually is to me, currently the biggest challenge in data collection and also in putting data together, right? We're so smart in the labs, we're so smart in experiments that how can we be sure this is exactly what happens out there in the open field, right? And for basically bridging that gap, right, that is where we don't have enough data, quite generally speaking. But what we can do in the lab is, is amazing, actually.
Jon Krohn: 00:33:55
Very cool. Yeah. So, this is in contrast to a typical farm set up today, I imagine. I mean, I'm sure there's, there are farms or, you know, it's probably something that's a growing area where you have farms trying to have more comprehensive sensing. But that I expect today isn't the norm, whereas you're describing in the lab you could either have internal, like indoor experiments, I guess, where you literally control everything. Or you could have outdoor experiments where other than the weather, as you mentioned there, you can control all the factors, and you can be collecting as much data as you want. So, you can have the, this precision agriculture that Feroz was talking about at the beginning. You can really have that in the lab. You can have all of the spatial, temporal, and genetic information on an individual plant level at an exact point in time to be providing guidance. But then, as you're saying, the tricky part then is translation to the real world, because yeah, in the real world, you are only in, I imagine very rare circumstances going to have that level of spatial, temporal, and genetic precision. And so, yeah, so how, how do you do that? How do you, how do you think about experiments and, and try to bridge that gap as best you can?
Thomas Jung: 00:35:18
So, we're actually, we're learning, right? We're, we're learning every day still, right? So, we just recently had a, had a beautiful example where we positioned hundreds of sensors in a field, in a trial field, right? To capture data about the soil, capture data about almost every individual plant and what's happening to that plant. And of course, we knew the weather for this field, right? From, from our global data. So, what we saw actually is a pattern that our scientists, our data scientists, couldn't explain, right? Because while the field was doing pretty good, a couple of plants were doing really, really bad, right? And the data collected by all our sensors didn't give any clue, right? So, data science hits its limits, actually, when we're out in nature sometimes. So, when you walk into this field, what do you see? Very, very simple, frustratingly simple actually. Those particular plants just stood in the shadow of a big tree, right? Simple as that, right? And the only thing we didn't measure was direct sunlight at the particular spot, because we thought sun is sun in a given area, right? So, and this is just like we're, we're capturing all the data in the world, and we still miss something, essentially. And that's actually the beauty of our job, right? That this is actually a field where we're learning every day, but we also have such a wealth of data in our hands that we can do a huge lot already with what we have.
Jon Krohn: 00:36:49
Fascinating. Yeah. So, so a big part of the learning there is figuring out how you can get more sensors out there. And so the kind of work that you're doing, Thomas, is you're thinking ahead in terms of what's possible in farming and figuring out how, how this is gonna work. So, that, I kind of talked about that tight loop earlier. You have the tightest loop in your lab where you know, you really are in control of everything, but you are, you are imagining, okay, a few years out, maybe a decade out how can we be having our, have these kinds of sensors or tools in farmers' hands so that they're, they're measuring sunlight in a specific area, and we can know that there's, you know, there's a problem due to the shade of another tree. So, yeah, that's really fascinating. So, we talked with Jeremy about generative AI, about LLMs and the ways that those can interact with a person through generating natural language, whether that's written or spoken. I understand that you're involved in something Thomas called Generative Chemistry. I've never heard of that before. What does that mean?
Thomas Jung: 00:38:00
Fantastic. Yeah, it's this, it's, it's a great space, actually, before we're even hitting plants, right? It's understanding chemistry through large language models, but all, all types of artificial intelligence in most general terms, right? And if you want to stick close to language, probably a good one is retrosynthesis in that space, which you may not have heard of either, right? So, and I haven't, because I'm not a chemist, right? But it's fascinating to learn. So, essentially, this is decomposing chemistry, right? And understanding reactions required to produce a certain desired molecule, right? So, once we know what molecule actually we want to create to try the effect on a plant, right, we typically have scientists trying this based on all their historical knowledge and their brains and their smarts, right? Which is a bit of trial and error, and very often succeeds, but needs a couple of runs, right? And what has been happening in the industry, that's not a Syngenta invention. What we're doing together with many, many partners is to apply large language models to chemical reactions, right? Because in the end, chemistry is a language, right? And quite literally, actually, some of the models behind Google Translate have been used in chemistry, right? Saying a reaction A + B = C, right, is kind of a language, right? It's a very particular language, but the fundamental model is reasonably similar.
00:39:38
So, what we're able to do with all this is essentially having the algorithm suggest chemical routes, that's how we call it, right, essentially, suggest experiments, suggest reactions to do, to create a desired molecule, right? And the algorithm would generally get it right. Where we're at at the moment is that we get a set of probably 10 recommendations, and we'll have our scientists look at these, sanity-check those, and most likely they'll find one route that actually is very much an appropriate one, right? That is able to create the molecule that's actually a very novel molecule.
Jon Krohn: 00:40:20
And so this sounds somewhat familiar to me, to the kinds of research that DeepMind does with AlphaFold, where you are using the sequences of genetic information or protein information to predict what a protein structure will look like in 3D. And this has been historically a task that is extremely compute-intensive, not very accurate, but just in recent years, these kinds of computational approaches have become suddenly extremely effective. And for some particular kinds of these, at least protein structure prediction problems that AlphaFold is tackling, that DeepMind is tackling with AlphaFold, sorry. They, these are like, solved problems in some cases. And in other cases, I know that it's still extremely difficult to be able to make these predictions accurately. So, earlier this year, we had Professor Charlotte Deane from Oxford University in episode number 643, and she specialized in you know, she's friends with Demis Hassabis from DeepMind, and I think the night, if I remember correctly, the night before we recorded, she'd been having dinner with him. And so, you know, they end up talking about AlphaFold and these these, these biological structure prediction problems. But Charlotte's tackling ones that DeepMind hasn't cracked yet. She has, because there's, you know, some molecules have properties that make them easier to predict than others with today's technology. So, this sounds really fascinating. What kinds of compounds, what kinds of compounds are you trying to predict, Thomas, and why is this useful in agriculture?
Thomas Jung: 00:42:03
So, what's been most inspiring for me in the past month actually, is when the models help scientists to stretch their own imagination, right? So, when we're looking for, for new products, right, new molecules that can help protect plants, they usually are generally known classes of activity, right? So, when a scientist would see a molecule, they can usually say like, this could be a viable option or not, right? And what we're hoping to achieve, and we're seeing exciting first signs through these large language models and, and other models actually, is that we're, we're stretching that, that universe that scientists look at, right? And most recently, we again had a couple of recommended molecules for a certain biological challenge. And in this set, there was one very, very odd molecule. Our scientists just bluntly dismissed, saying like "This, this is never going to be a herbicide. It just doesn't look like a herbicide. This can't be herbicide," right? Then you still take a second look at it, and you'll realize that magically there is actually some activity like a herbicide, right? It's not going to be a product. It wasn't the perfect suggestion, right? But it was something that every human would've completely dismissed, but for some magical reasons, it still is a black box, right, our algorithms would say, look at this. So, and that, that's for me, the biggest promise, actually, right? That we can explore areas of chemistry that no man has ever looked before, essentially. Because the model would suggest us to go there. And, and that to me is super exciting.
Jon Krohn: 00:43:47
Yeah. Really fascinating indeed. So, Feroz, I've noticed that you've been taking lots of notes and you've been smiling enthusiastically as Thomas and Jeremy were speaking their pieces. So, I'm sure you have lots of follow-up things that you'd like to say. But let me give a little bit of context, which is that as a seasoned CTO and entrepreneur, you've seen firsthand the challenges of deploying innovative technological solutions at scale. So, taking the kinds of things that Thomas was talking about and applying them at the kind of scale that Jeremy needs in practice on all the farms. So, what's your vision for how that gap can be bridged? How can data science and machine learning make its way out of the lab and into all the fields, all over the world?
Feroz Sheikh: 00:44:33
Yeah. It's, I mean, some of the ideas that Jeremy and Thomas were talking about are kind of like the utopian dreams that if they come true, you know, you would really start to see a big step change in agriculture. And, you know, there are practical challenges. So, when, when we start taking technology out in the field, things don't exactly pan out the way you expected, you know. My favorite example is, you know, you develop a very sophisticated computer vision algorithm, and we take it out in the field, and what you have is a little bit of mud obscuring the camera. Examples like these make it harder to take the technology out in the field, and you have to then account for various scenarios like this that may interfere. Another, another example I have is, you know, some of the modern tractors and machinery are so sophisticated, they generate so much data. You know, we are really talking about gigabytes of data being generated every time that machine goes out in the field. And the ISO standards and the wires and the connectors in the tractor today are not, are reaching their limit of how much data they can transfer. So, the industry standards and bodies are having to evolve those standards to allow this much of data for precision to pass through to the algorithms that are then, you know, acting on that data.
Jon Krohn: 00:46:13
Right. So, things like 5G today you know, emerging more and more. And then for the 6G standards, I guess, that are coming in the future, and the 7G after that it's going to allow for more and more of this precision agriculture I imagine, and yeah, I kind of just spoke over you.
Feroz Sheikh: 00:46:31
Yeah. That plus also, you know, this fact that a lot of this innovation needs to probably start to operate on the edge. So, when you develop these models that, you know transferring that volume of data from remote fields, you know, back to the cloud could possibly be an after-the-fact sync but you allow the machine to intervene right there when it is out in the field, you know, by operating on the edge. And there is this you know, combination of applying the data to decide to create a prescription or to prepare a product and then take it out in the field. So you don't have to do all the computation out in the field, you know, based on all the data that we have seen, either through the IoT sensors, satellite imagery, or, you know, the last time the tractor was out in the field, which collected so much data we could create prescriptions, and those prescriptions can be sent as an instruction to the machine which is then geocoded, for example, let's say, you know, about a herbicide application. So, the machine goes out next time, and it is carrying out that prescription as applied.
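The geocoded prescription Feroz describes can be pictured as a lookup table that is computed ahead of time and then consulted on the machine itself. The Python sketch below is only an illustration under assumptions: the rectangular `Zone` layout, the coordinates, and the litres-per-hectare rates are hypothetical, and real prescription maps use richer geometry and machine-specific file formats.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Zone:
    """Axis-aligned rectangle in decimal degrees with an application rate."""
    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float
    rate_l_per_ha: float  # hypothetical herbicide rate for this zone

    def contains(self, lat: float, lon: float) -> bool:
        return (self.min_lat <= lat < self.max_lat
                and self.min_lon <= lon < self.max_lon)


def rate_at(prescription: Iterable[Zone], lat: float, lon: float,
            default: float = 0.0) -> float:
    """Return the rate for the first zone containing the position.

    This runs on the edge device: no network call is needed because the
    prescription was prepared earlier from sensor and imagery data.
    """
    for zone in prescription:
        if zone.contains(lat, lon):
            return zone.rate_l_per_ha
    return default  # outside every zone: apply nothing by default


# Two adjacent zones; the values are invented for the example.
PRESCRIPTION = [
    Zone(41.000, -93.000, 41.001, -92.999, rate_l_per_ha=1.5),
    Zone(41.001, -93.000, 41.002, -92.999, rate_l_per_ha=0.0),
]

if __name__ == "__main__":
    for lat, lon in [(41.0005, -92.9995), (41.0015, -92.9995), (42.0, -92.9995)]:
        print(lat, lon, rate_at(PRESCRIPTION, lat, lon))
```

The heavy computation happens where the data and compute are plentiful, and the machine in the field only needs its GPS position and this small table.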
Jon Krohn: 00:47:49
Very cool. Yeah, it's, I, it's interesting that I didn't even think of that. I mean, we talk about edge computation on the show pretty frequently, but for me it was like, oh, we just need to figure out a way to get all these huge amounts of data gigabytes today, terabytes tomorrow, let's just get some networks and get those all back to Syngenta servers so that you can be processing. But actually, yeah, the solution you're, you're describing makes much more sense, where we have the models working on edge devices, they can just be working there in real-time. And so it's, it's interesting now that you mentioned these huge amounts of data coming from devices like tractors, where on the one hand, we have huge amounts of data coming in more than can possibly be sent more than can possibly be processed. And simultaneously, we don't have nearly enough information, we're missing a lot of the information that Thomas and Jeremy would like to have in order to be able to make their best recommendations and have, you know, real precision agriculture.
00:48:47
So, yeah, it's an interesting situation where I guess we, and it's the natural way that things will evolve, where the maker of the tractor is thinking, well, you know, I can sell more sensors on this thing, and the farmer will pay for them. And so then, you know, there's more and more and more data being collected of this very specific type that isn't necessarily exactly what you need in computational agronomy to get the best out of an individual plant or get the best out of an individual farm. So, yeah. So, I guess there'll be all these changes. I mean, it's amazing how, how rapidly things change. I wonder-
Feroz Sheikh: 00:49:25
We can start to think about, you know the difference between what's perfect or what's best versus what's good enough. You know, to build upon your point that we could have the most sophisticated models that need the most precise measurements at, you know in near real-time. But then there is also a cost associated with collecting that much data and processing it and, and so on. For the most part, you know, it might be good enough if you were to take a, reduce the sampling rate or reduce the frequency, and so on. And I think that's the, that's one philosophy which we try to follow, is to strike a balance between, you know, the most perfect data point, the most perfect sensor the most granular possible versus what's good enough for the most part, you know?
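One way to make the "good enough" trade-off concrete is to ask how far the sampling rate can drop before a summary statistic drifts past a tolerance. The Python sketch below does this for a synthetic temperature-like series; the signal, the candidate steps, and the tolerance are all assumptions chosen for illustration, and a real decision would use the metric the downstream model actually depends on.

```python
import math


def mean(values):
    return sum(values) / len(values)


def coarsest_acceptable_step(readings, candidate_steps, tolerance):
    """Return the largest step whose subsampled mean stays within
    `tolerance` of the full-resolution mean (1 if none qualifies)."""
    full = mean(readings)
    best = 1
    for step in sorted(candidate_steps):
        # readings[::step] keeps every step-th reading, i.e. a lower sampling rate
        if abs(mean(readings[::step]) - full) <= tolerance:
            best = max(best, step)
    return best


# Synthetic day of 15-minute readings: a daily cycle plus a slow upward drift.
SAMPLES = [20.0 + 5.0 * math.sin(2.0 * math.pi * i / 96) + 0.01 * i
           for i in range(96)]

if __name__ == "__main__":
    step = coarsest_acceptable_step(SAMPLES, [1, 2, 4, 8, 16, 32], tolerance=0.1)
    print(f"keep every {step}th reading ({96 // step} of 96 samples)")
```

The same loop could compare model accuracy instead of a mean; the point is only that the cheaper sampling schedule is tested against an explicit tolerance rather than assumed.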
Jon Krohn: 00:50:23
Yeah, yeah, yeah. And so, as, as we've, as any regular listener of the show will have heard, I'm constantly talking about large language models, open-source LLMs, the way that these are dramatically transforming the field of AI. Everyone in, you know, a huge portion of the world now is aware of tools like ChatGPT in particular. And so these transformative changes happen really fast in AI and data science. But hearing all three of you describe the problems that you're tackling, it is abundantly clear to me that not just for years to come, but for decades to come, there will be problems to solve here and situations that can be optimized. Because even if Thomas sitting in his lab is able to come up with this outstanding model, or Jeremy knows from his computational agronomy that, you know, there's this great opportunity to be optimizing yield, it's still going to be so many years, it's going to be decades before the hardware can be out there to be collecting the relevant data in the way that you'd ideally like to have to be able to compute on the edge or to be able to transmit those data for processing on a server.
00:51:44
There's like, yeah, I mean, there's, there's certain situations like I imagine in the West, there's particular situations where, you know, tractors say, are able to collect a lot of data. But then when we think about farmers in the developing world who just, who maybe just have a phone and they might not even have that today. And, and, you know, relatively low bandwidth over satellites in a lot of these places. And so there's, you can see that even if the data science and the AI progresses very rapidly today, there's still decades and decades of work to ensure that all of these advances propagate out. And then there's this really nice, investors, AI investors love to talk about flywheel effects. You've got this flywheel effect where as more and more of those devices get out there, as we have more and more edge compute, as more and more of these algorithms are developed, as these kinds of standards that you were talking about there Feroz, as those become, as the standards become standard and there's better interplay between devices and models and providers there'll be this, this flywheel effect where more and more models can be useful. And then, so more and more hardware is useful in the field, and this is just this continuous acceleration of precision agriculture. This is really exciting. So-
Feroz Sheikh: 00:53:08
Yeah. And if, if I could just build on that, there is one additional factor that will happen. I mean, yes, it'll take time to perfect what works in the lab and make it scalable out in the field with all the variability that comes, you know, out in the field. But there is, there are some interesting effects which will happen. So, it may not probably take decades, it'll probably happen sooner, through a leapfrog effect. So, where, you know farmers, countries, and economies will learn from what's happening elsewhere. And I'll take a very simple example of the use of drones in agriculture. Now, the western farmers, the developed world went through several iterations of perfecting what a drone can do, whether it is scouting or spray application in a field. But the developing countries, right, would probably get there in one single leap because they don't need to redo all of those learning steps. The same drone which will need to come back for recharging, which is limited by its range or the capacity of the tank, you know, when it is flying over a 2000 ha field in the US, can quite easily in a single fly pass, cover a 10 ha field in India.
00:54:38
So, the technology, which, you know, got incubated in a different market probably is going to reach its limit of applicability. But in the other markets you know, it's, it's ideally suited and that will then trigger this leapfrog, the, the flywheel effect that you spoke about, where you have farmers who are starting to get aware of what the drone could do for me. And you have micro-entrepreneurs who are coming in to help the farmer operate the drone. And then we have the data and the products that actually leverage the availability of the drone to make an impact for the farmer.
Thomas Jung: 00:55:15
Yeah. And probably to add one more of these, these impacts that can actually propel us by decades, right? I referred earlier to the challenge of translating what happens in an experiment to the real world, right? And the more actually, our farmers are professional data workers of some sort, right? And the more data we have of what happens on the actual farm and people are sharing this, and ideally we're sharing a lot of our data and farmers are sharing their data in a very open data ecosystem, right? The better we can actually understand what happens on a farm on the field, and how it relates to what happens in the experiments.
Jon Krohn: 00:55:56
Yeah. Yeah. Yeah. Fascinating. Yeah. And really exciting that, you know, I was, I, yeah, I wasn't thinking about these kinds of leapfrog effects that will allow some of these innovations to make a really big impact right away where yeah, the Indian farmer can buy a drone and then and then all of a sudden be having capabilities instantly that took many years to develop elsewhere. Very cool. So, if we have listeners out there who are data scientists or getting into data science, and they're thinking, wow, I want to make a big social impact, it's clear that there is a huge social impact that they can be making in feeding the world in all the kinds of problems that you have been talking about all three of you have been talking about in this episode. I know that Syngenta has a startup accelerator and a venture capital arm. So, there's Syngenta Group Ventures specifically as their VC arm. So, what kinds of challenges are there in agriculture that a listener out there might want to address right away?
Feroz Sheikh: 00:57:00
Yeah. I think one of the key problems that we are looking to solve is, you know, in this area of sustainable agriculture or what we call as regenerative agriculture we've identified soil health as a key area that has an impact on the outcome or yield that you can expect. And any kind of, you know, technology that could be looking at data collected through sensors or through satellite imagery that helps us interpret, you know, what are the various properties of the soil and you know, matching it with agronomic protocols to regenerate the soil, measure the soil organic carbon and, and help quantify the carbon that has been sequestered could potentially lead to improving the yield per acre for the farmers. And it could also help generate new revenue streams for the farmers if the farmer has a reliable carbon measurement, for example.
00:58:06
So, some of those areas, you know, when it comes to regenerative agriculture could help the farmer. And if there are startups and, and data scientists and innovators out there they could, you know, think about some of these kind of problems. Another example I can take is, you know, when we start to look at entirely novel agronomic protocols, you know. When, when cultivating rice, you have to first plant the rice in a nursery, and then when the plant is, you know, at a certain growth stage, you have to take it and put it out there in the open field. That's called transplantation. But through the use of drones and you know, innovative products, what we are looking at is direct seeding of an open field, eliminating the whole transplantation step in the middle.
00:59:05
Now this may not seem like an obvious problem for a startup to imagine, but there is a whole lot of data, data science, and how to run that drone and, and allow it to plant the seeds at the right place that can have this kind of an impact, you know, eliminating the whole step of transplantation for rice farmers. And if this happens at scale it would be transformative for everyone. And we do have some work going on in this space where we have innovative products for rice growers and drone solutions that we are trying out in countries like Japan and Indonesia.
Jon Krohn: 00:59:47
Nice. Excellent examples there. And yeah, so I hope there are some inspired listeners out there thinking, all right, I'm gonna get a start going, start getting a pitch deck together and start trying to figure out how we can be doing these kinds of things, like having direct seeding with drones. Really exciting things, really big opportunity to feed the world through innovation. So, speaking of innovation, Jeremy, we've talked about how we want to get these innovations into the field, a lot. When you're doing that, when you want to be making recommendations to a farmer are you able to give us some insight, obviously there's, there's gonna be some kinds of proprietary secrets that you can't divulge, but to what extent are you thinking about like, using a simple heuristic model or do we, or, or a complex machine learning model when you're thinking about tackling a specific agronomic recommendation for a farmer? Yeah. How do you go about designing your ML models?
Jeremy Groeteke: 01:00:53
Yeah, no I think and some of the listeners have, have had one of my colleagues so you know, I hire great people like Serg, right? Another data scientist on our team. And really we try not to confine them to any one type of model technology. It could be a simple weather-based or heat unit or heuristic. We could use Bayesian machine learning. Like we, we really don't try to limit the teams and say, only use this methodology or this tech stack to be able to solve the problem. So, we have examples where we use CNNs all the way through for machine learning of yield prediction through satellite imagery, and training datasets. We have very basic empirical models that are calibrated and validated to deploy and have disease estimations of that coming out, to very complex and data-intensive mechanistic models that can come and deliver.
01:02:02
And so we try not to say, go one way or the other. And frankly, what we've seen is it's the combination of these models. And so because Ag is a cyclical world, right, you plant a seed, it grows, you harvest it. The timeline is very different if it's a tree or a leaf of lettuce that you're growing. But that timeline just changes. But it's the same basic principle you plant and harvest. And so there's always an off-cycle in this, in this world, a fallow world that doesn't happen. And so we may not always get data sensors you know, active during the season. And so we have to have models and, and systems that work when the plants are not even growing so that we can say what is coming. And so this is why not always is machine learning the solution, or is a mechanistic model the solution. Sometimes we need to have both operate in conjunction or an orchestration of these models and bring them together.
01:03:08
And so that's what I would say, honestly has been probably, I think, the breakthrough on the team. We really try to hire data scientists that have skill sets in all of these. So, my pitch to every audience member is come work for us. We're really cool and sexy and we work on every single tech that you can think of. It's not just Google, Facebook, and Uber and all them, like Ag dives into the newest tech we are working on. So, please I'm hiring, come look. But I think that's the thing that the team has unlocked. And as we bring data scientists together that may be very specialized with Transformers or CNNs or large language models or, or old school guys doing principal component analysis or mechanistic models like the unlock, frankly, and I don't think it's trade secrets, but it's, it's combining these technologies and really stringing them together to bring them out and unlock.
01:04:12
Because frankly, when we've gone down one path, like we hit a roadblock because we, the data source doesn't become relevant, it's trained on a very limited data set. So, we've had it where we can train a machine learning model for say the North America market. And then when we move to the Latin America market, it doesn't transfer. And so this transfer learning piece, we've really been exploring and trying to understand because we want to operate on a global scale in that piece. And so this is some things where we have to have almost one model as an input to another model. And this is a framework that we've actually been developing behind the scenes. It's kinda like going to your kitchen. You know, each model becomes its own thing. And so it could be the flour, it could be the sugar, it could be the honey, it's an ingredient. So, we may have a model for phenology, and a model for yield prediction, and a model for weather. Well, those are all individual ones, but we're gonna plug these together to actually deliver a true insight. And so this is really where we're starting to go as models are becoming ingredients into other larger complex models. I think we went down this path of like, oh, we just need this. And the reality is that's not the case. And, and they also feed inputs into other models.
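The "ingredients" idea, where one model's output becomes another model's input, can be sketched as a small chain in Python. Everything here is a toy under stated assumptions: the heat-unit formula is the common growing-degree-day average, but the base temperature, the stage thresholds, and the yield numbers are invented, and the function names are hypothetical rather than anything Syngenta has published.

```python
def accumulate_gdd(daily_min_max, base_c=10.0):
    """Weather ingredient: growing degree days (heat units) from
    daily (min, max) temperatures in Celsius."""
    return sum(max(0.0, (tmin + tmax) / 2.0 - base_c)
               for tmin, tmax in daily_min_max)


def phenology_model(gdd):
    """Phenology ingredient: map heat units to a growth stage.
    The thresholds are made up for the example."""
    if gdd < 50.0:
        return "emergence"
    if gdd < 400.0:
        return "vegetative"
    return "reproductive"


def yield_model(stage, rainfall_mm):
    """Yield ingredient: consumes the phenology output plus rainfall.
    The potentials (t/ha) and the 500 mm cap are invented."""
    potential = {"emergence": 6.0, "vegetative": 8.0, "reproductive": 10.0}[stage]
    return potential * min(1.0, rainfall_mm / 500.0)


def pipeline(forecast, rainfall_mm):
    """Orchestration: each model feeds the next, and every
    intermediate result is kept so it can be inspected or reused."""
    gdd = accumulate_gdd(forecast)
    stage = phenology_model(gdd)
    return {
        "gdd": gdd,
        "stage": stage,
        "yield_t_per_ha": yield_model(stage, rainfall_mm),
    }


if __name__ == "__main__":
    print(pipeline([(10.0, 20.0), (12.0, 24.0), (8.0, 16.0)], rainfall_mm=250.0))
```

Because each ingredient has a plain input and output, a simple heuristic can later be swapped for a machine learning or mechanistic model without touching the rest of the chain.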
Jon Krohn: 01:05:36
Fascinating. That makes a lot of sense. And how rude of me to suggest the venture arm and startups as the first route for our listeners to go to when they want to be making a big impact. With agriculture, of course, they should be looking at Syngenta as well, so they can really hit the ground running and take advantage of experts like all three of you and all the other data scientists like Serg that are there. Yeah, it sounds like a really amazing opportunity where, in that kind of role, you're then working with all these different kinds of hardware, all these different kinds of problems with drones, with machine vision problems, weather problems, soil health problems, yield problems. There's, yeah, like you're saying, all of these different ingredients, which can be different models. And then I imagine there's also fascinating things around getting the data engineering and the machine learning engineering right, so that the data flows and the model outputs respectively are well optimized, so that when you have all these models chained together they can still be performant in real-time in production. So, that's, that does sound, yeah-
Jeremy Groeteke: 01:06:45
No, Jon, it is spot on. And this is a piece of, I think, really where we're headed in insight. So, frankly, there is a foot army of advisors and agronomists and people walking fields around the world, and they're taking notes and observations, whether it's pictures or free text or whatever. And this is this direct feedback loop that can happen in the real-time world of models. And so we could have a model that says Tar Spot is coming into the environment, and real-time on-the-ground validation happens when an agronomist or a scout walks out in that field and says, yep, Tar Spot's here. And so that automatically can feed back into the training in real-time. And so this is really where we start seeing, I think, the error bars shrinking and the prediction accuracy moving at a really rapid pace as we make these connections. And so this is some big architecture and key components.
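[Editor's note: A minimal sketch of the feedback loop described above, assuming a simple online logistic regression that takes one gradient step each time a scout confirms or rules out a disease in a field. The class name `OnlineDiseaseModel`, the two features, and the learning rate are hypothetical and chosen only for illustration; this is not a description of Syngenta's system.]

```python
import math
from typing import List


class OnlineDiseaseModel:
    """Logistic regression updated one scouting report at a time."""

    def __init__(self, n_features: int, lr: float = 0.1) -> None:
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x: List[float]) -> float:
        # Probability that the disease is present for feature vector x.
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x: List[float], observed: bool) -> None:
        # One stochastic gradient step on the log loss, using the
        # scout's observation as the ground-truth label.
        err = self.predict_proba(x) - float(observed)
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


if __name__ == "__main__":
    model = OnlineDiseaseModel(n_features=2)
    field = [0.9, 0.7]  # hypothetical scaled humidity and leaf wetness
    before = model.predict_proba(field)
    model.update(field, observed=True)  # scout confirms the disease
    after = model.predict_proba(field)
    print(before, after)
```

[Each confirmed report nudges the predicted probability for similar conditions upward, and each negative report nudges it down. Libraries with incremental training, such as scikit-learn's `SGDClassifier.partial_fit`, offer the same pattern with more safeguards.]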
01:07:46
You know, it's like Tesla, right? Every mile they drive improves and develops their model because they have this feedback loop, right? And we see the same methodology or process coming in Ag., whether it comes from a tractor driving through the field or a human walking it. We're gonna have this much more automated data stream coming that we've never had before. And so that's the biggest thing, because all this data's great, but we have to get to causality and repeatability to really drive and develop the insights. And, and I think that's the piece that we've struggled over the years is like Thomas mentioned, oh, I needed to know the sunlight to know that that shading, right, you know, the smoke that comes across North America from the Canadian fires, what's that doing to our solar radiation reduction for photosynthetic capacity that is going on? And so these are things that we have to get causality figured out for and codified. And this is the key thing. Like we know it in our heads because we've been trained as a scientist, but codification of that and getting it into algorithms and combining them is really where we start making these jumps.
Thomas Jung: 01:08:59
Yeah, just, just to build on this point, because it's, it's so exciting how actually science is changing because exactly what Jeremy's saying. Like where historically, I mean, traditional scientists would have a hypothesis, right? And try to prove or disprove that hypothesis through an experiment, right? And what we're doing now is actually we are creating the data we want, and we are running experiments to train our models, right? We're not proving or disproving anything anymore. I mean, we do in places, but in other places, actually, we would run experiments just to create training data, right? And that to me is a fundamental shift of science, right? Where it's not about right or wrong or testing, trying something, but actually we have physical science in service of data science, and that's just fascinating that we have this ability to create the data we need.
Jon Krohn: 01:09:52
Yeah. Really fascinating. And I imagine there's things that you can do on the R&D side, Thomas, to be helping with that. So, I've read about smart growth chambers, for example. Is that helpful here?
Thomas Jung: 01:10:03
That is actually, I mean, it's, it's, it's part of, part of that story actually, right? So, we got a couple of dozen growth chambers. They're probably as big as your garage. Well, not your garage specifically. So, a bit in general-
Jon Krohn: 01:10:18
[crosstalk 01:10:19] Right.
Thomas Jung: 01:10:20
Not 20 Teslas in your garage. So, any regular garage size, right? So, a couple of dozen of those, those chambers where we can essentially simulate any climate that we want, right? And that simulation is just the input, right? What to me is even more appealing is that we also control the data output, obviously, right? Because it's a very controlled environment. We can measure whatever we want. We can measure every single plant. We can have photos, images of every plant, every minute is rarely useful, but every hour, for example, right? So, we're truly in full control and all this in the conditions we want. So, we can model very rare events, we can model specific climate zones, we can model for the impact of El Niño, as Jeremy mentioned before, right? So, and we can actually do all that and take all the data points that we need to feed our models to then do magic with it, essentially.
Jon Krohn: 01:11:19
Amazing. That sounds really exciting. So, these smart growth chambers, they're garage sized and they allow the, I think there's a futurist named Michio Kaku. He's written a number of books and hosted TV shows. One of his tenets of being a futurist is that the future is already here. It just isn't evenly distributed. And so what you're describing there with a smart growth chamber, this is, you know, you might have a few dozen of these garage-sized environments, but this is a vision of the future at scale. This is what farming is going to be like all around the world in our lifetimes. Really exciting.
01:12:04
So, I posted that I would be hosting all three of you on this show, and we got some great audience questions. I'm gonna start with our regular listener, Matías Baudino. He actually frequently has questions for our guests. He's a BI analyst at a company called Brain Technology. And he says that in general he loves these kinds of episodes where we can grasp great practical uses of data science and, his first question is, he wants to know what it's like to work with the amazing Serg Masís? And he says, jokes aside, what role would data science and machine learning play in the face of global warming in particular? So, do you say simulate crops in the worst conditions and how we could potentially shield crops as climate conditions worsen in the future?
Feroz Sheikh: 01:13:00
Yeah, and this is, this is an active topic of work for us as well. So, you know, spot on, fantastic question. I think this will affect, in or, you know, manifest in probably two or three different ways. The first is through helping discover new products through the R&D, you know, work like Thomas spoke about. Products that are able to withstand the climate change. You know, just to give you an example. We have one of our corn varieties, Artesian corn is able to convert whatever moisture available into yield more effectively. So, you know, it brings a great amount of drought tolerance. The discovery of Artesian corn has gone through you know, in our labs when we analyze tons and tons of data, identifying which genetic traits and which varieties respond to, you know, the drought conditions and so on, ultimately allowed us to bring this product to market.
01:14:09
Now, a lot of data science and innovation went in, but it may not be applied, you know, through a computer. It's actually applied through the genetics that are present in this variety. The second scenario is when technology directly becomes helpful to the farmer. For instance, one of our digital products is Agriclime, which allows a farmer to mitigate the risk of adverse weather conditions. It's a FinTech solution. Behind the scenes, Agriclime creates models predicting what the weather is likely to be for the season, and it allows a farmer to, you know, get a bit of mitigation in case the weather pans out with, let's say, extreme heat or drought conditions and so on, right? So, these are examples where the growers are directly interacting with technology and data and the data science, either in making that prediction of weather changes or dealing with the impact of weather changes. But yeah, maybe Jeremy, Thomas, any builds from you?
Thomas Jung: 01:15:23
Probably pick up on Matías' point about the amazing Serg, right? Because this is very, very close to my heart. I mean, not just Serg but actually people around us, right? It's such a pleasure to work with such a smart bench at scale, right? And also that diversity. I mean, I lead IT for R&D, right? You would expect this bunch of IT nerds, but actually this team consists of biologists, chemists, data scientists, and all kinds of IT professionals, right? And then we're working with all the PhD chemists, biologists, soil scientists, entomologists, you name it, right? So, it's such a great group to be with. It's just fun. I mean, I'm with you, Jeremy, on this pitch: this is the place to enjoy for an IT professional. For me it is. I got stuck here, right? I've been here for 12 years and I never thought I would stay for that long, and I'm enjoying the ride.
Jeremy Groeteke: 01:16:23
Yeah. Jon, I just wanted to build on his question around the science side. You know, as we talk about climate change and that nature happening, right, CO2 levels are supposed to rise. Well, actually, the cool thing about modeling is plants benefit from CO2. Actually there is, there is a good side of this whole thing whether you-
Jon Krohn: 01:16:41
Whoa, I had never made that connection, whoa-
Jeremy Groeteke: 01:16:44
Whether people wanted to know this or not, as CO2 levels rise, yield goes up. You see, plants bring in CO2, and then they produce sugar and glucose to provide yield. So, there is a natural increase in plants in vegetative mass as CO2 grows. Now, there's a whole ton of other things that go wrong, but through the science and modeling, we can actually model out the impact of that level and what will drive yield. We can also see what temperature extremes will do. Like, what do we need for heat stress? You know, simple things like, hey, if we change the leaf angle on corn from flat to more upright, what does that do to photosynthetic rate and penetration of light through a canopy? So, things like this, we can, you know, artificially manipulate what the outcome will be.
01:17:34
Fast forward in generational modeling, through whole genome type modeling, to understand should we, you know, in a, a world of, of high sunlight, high heat, should we have a canopy upright or open, and what will that do to the whole overall process? And so, yeah, the whole science and ability to leverage this and combine it with agronomic and ecological modeling is huge and allows us to play forward in time what are some of these impacts and what they do. So, there is a good side of this whole thing. Yields go up naturally at the end of the day.
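[Editor's note: The leaf-angle question above can be illustrated with the textbook Beer-Lambert approximation for light in a canopy, where the fraction of light reaching a given depth is exp(-k * LAI), LAI being the leaf area index above that depth and k an extinction coefficient that tends to be lower for more upright leaves. The k values below (0.4 for upright, 0.8 for flat) are illustrative assumptions, and this simplification is not presented as the modeling the guests use.]

```python
import math
from typing import List


def transmitted_fraction(lai_above: float, k: float) -> float:
    # Beer-Lambert approximation: fraction of incoming light that
    # reaches a point with `lai_above` leaf area index overhead.
    return math.exp(-k * lai_above)


def light_profile(total_lai: float, k: float, layers: int) -> List[float]:
    # Light fraction at the top of each of `layers` equal canopy layers,
    # from the top of the canopy downward.
    step = total_lai / layers
    return [transmitted_fraction(i * step, k) for i in range(layers)]


if __name__ == "__main__":
    # Hypothetical extinction coefficients for the two leaf habits.
    for label, k in (("upright", 0.4), ("flat", 0.8)):
        profile = light_profile(total_lai=4.0, k=k, layers=4)
        print(label, [round(p, 3) for p in profile])
```

[Under these assumptions the upright canopy lets more light through to the lower leaves, which is the kind of trade-off that a fuller agronomic model would then weigh against heat and water stress.]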
Jon Krohn: 01:18:14
Wow, that's wild. Yeah, I'd never thought of that. Yeah. So, last audience question here for you. I mean, second and last. So this is from somebody who I actually, I don't know their real name. So, they interact with my LinkedIn posts very regularly. They go by initials. So, SM M. is what they go by. They're based in Pennsylvania. They have a PhD in cell and developmental biology. But interestingly, this mysterious initials-only individual has a LinkedIn account, yet it seems like interacting with these SuperDataScience posts is a big part of what it's there for. So, like, they don't have any followers, but they often interact with the posts I make and have great questions, great points, and are always super positive. So, SM M., whoever you are, thank you for all of your support.
01:19:18
SM M. had lots of questions. I think we answered a lot of them over the course of the show. Their final question I think is a really great one here. So, they asked, how much do rare events throw things off? So, you guys are working in the real world. You have models that depend on historical data for the most part, but rare events happen particularly, you know, yeah, I guess there is this upside to more carbon dioxide in the atmosphere that means greater crop yields. But one of the downsides is that there's a lot more variability in climate systems and weather, and it's difficult to predict. You know, we've never seen these higher CO2 levels before, and so we, we don't, we can't have a perfect model of how that's going to impact agriculture, how it's going to impact the world. So, yeah, how do rare events impact the way that you model at Syngenta?
Jeremy Groeteke: 01:20:18
I can take a stab at it out of the gate here and let the other guys go. So, you know, rare events versus trends are a big difference. And so trend lines are relatively easy to model out. And I would argue, you know, the CO2 has been a good trend line component that we can deal with. The rare event problem really happens with the extreme weather that we've seen. So, we've seen massive spikes in heat waves. I know just last week, I think multiple world records were made on total high temperatures, both in the Persian Gulf and China. And I know our Southwest and the Phoenix market is really super hot. And we get wind events, derecho events, here in the Midwest, with extreme winds and brittle snap and laying crops down.
01:21:08
And so this is the bigger problem I think we see: you can't forecast it, right? Because they happen so fast. And so that's then the problem and challenge. And so unfortunately, to kind of finish where we started, this law of averages starts to come back into play, because we have to look at, okay, if we're gonna get these rare extreme events of 80-mile-an-hour derecho winds that move across the Iowa plains or Illinois and just lay corn down, what do we need to do to be able to withstand that type of event, right? Is it more robust root systems? Is it better stalk strength? Is it plant height? Because we don't know when they're gonna happen, and so we might have to kind of go back to a law of averages. If those events are gonna happen more in the future, we actually have to go back to, okay, we need to fundamentally maybe change the working system to be able to handle these rare events, because you can't predict them, at least yet. Maybe that'll come. And so, in a sense, where Feroz started, for some of these things we have to go back to an average to help mitigate those risks and deal with them, because they are so extreme. And so we do look at that, you know, do we need to shorten plants up, make them more robust for wind, as an example. Heat tolerance too: when we know those spikes are gonna happen, do we need to improve the overall heat tolerance or drought tolerance? Because, you know, it's hard to manage extreme outliers.
Jon Krohn: 01:22:54
Awesome. Yeah, really exciting things to be tackling as data science problems, really big impact that can be had through working in agriculture like at Syngenta or in a startup or some other agricultural company. But certainly sounds like Syngenta has a lot of great people and would be an amazing place to work, an amazing place to be making a big global impact. So, wonderful that all three of you, you know, leadership, you know, going up to the highest levels of leadership at the company that you were able to take the time and be on the show and provide such an entertaining and informative episode for our listeners. Before I let you go, I always ask my guests for a book recommendation. So, Feroz, maybe we'll start with you.
Feroz Sheikh: 01:23:37
Maybe I'll just pick the book that's, that's on my table right now is, is a book by an author that I know it's called Drawing Data with Kids. As a father, you know, how does he teach his daughter the importance of data and the analysis? It's by Gulrez Khan who's my wife's brother. That's why it's on my desk. But yeah, I would, I would strongly recommend this as a book. Fantastic read.
Jon Krohn: 01:24:06
Very cool. All right. And Jeremy?
Jeremy Groeteke: 01:24:10
Yeah, mine's pretty easy for me. It's Simon Sinek's The Infinite Game. In that, you know, we often get caught up in winning quarter by quarter in the corporate world, but the reality is food, agriculture, feeding the world is an infinite game, an infinite mindset. There's no ending to this. We have to continue to get better, and I think it's just a great read and a great mindset as we tackle the problems we're facing.
Jon Krohn: 01:24:36
Awesome. And Thomas?
Thomas Jung: 01:24:36
I would go for Louise O. Fresco with Hamburgers in Paradise. So, and it's not just for the title, but also it, it's a very wild ride through the history of agriculture, through the history of food we eat, right? It's not strictly data science but it touches on technology as well. And it's just a great read on like what actually do we eat and how do we produce the food we have on the table today?
Jon Krohn: 01:25:01
Nice. It's so cool that all three of you had the book immediately available in your right hand, so our YouTube viewers could see visually all of your recommendations. It's rare that people have the book right on hand that they recommend. And so for all three of you, the probability of that, unless there's a latent factor behind it, something to do with people at Syngenta reading a lot of books and always having them on hand. That I did not have in my model.
Thomas Jung: 01:25:32
I promise it wasn't scripted.
Jeremy Groeteke: 01:25:37
Yeah, my bookcase is right behind me.
Jon Krohn: 01:25:39
And yeah, very last question before I let you go is how people can follow you after this episode. So, many wonderful insights from across the spectrum of data science and agriculture from all three of you. Feroz where can people follow you after the show?
Feroz Sheikh: 01:25:56
LinkedIn and Twitter. So, my profile is linkedin.com/in/sferoz, but we can publish those links. Yeah.
Jon Krohn: 01:26:05
Yeah, we'll have those in the show notes for sure.
Feroz Sheikh: 01:26:07
Yeah.
Jon Krohn: 01:26:08
Nice. So, LinkedIn for Feroz. Jeremy?
Jeremy Groeteke: 01:26:11
Same thing. LinkedIn and Twitter @groeteke out there on the Twitterverse.
Jon Krohn: 01:26:17
Nice, and Thomas?
Thomas Jung: 01:26:17
Same here. LinkedIn. And more than following me or any of us, please, please do follow what happens in agriculture, right? I mean, it is just such a critical topic for the planet, right? Regardless if this is the three of us, others, Syngenta, other places, I mean, agriculture is the thing to be in data science.
Jon Krohn: 01:26:37
Very nice. Well said, a great conclusion to a wonderful episode. Thanks to all three of you again. And yeah, we'll have to catch up in the future and see how these amazing technologies have disseminated further across the world and how they're making a big impact. Thank you so much.
Jeremy Groeteke: 01:26:54
Thanks, Jon.
Thomas Jung: 01:26:55
Thank you, Jon.
Feroz Sheikh: 01:26:55
Thank you, Jon.
Jon Krohn: 01:27:02
Wow-wee what a spectacularly fun and inspiring episode. In it, Feroz, Jeremy, and Thomas filled us in on how precision agriculture will increasingly provide plant-level care on highly granular spatial, temporal, and genetic terms. How computational agronomy automates what agronomists learn in school, for example, on how a plant stands tall, stays disease-free, and bears fruit. And then computational agronomy automates all of this with machine learning. We also talked about how generative chemistry supports the discovery of useful agricultural compounds that might have a structure quite different from what a human would design. How smart growth chambers provide hourly machine vision data in a garage-sized environment that will in the coming years and decades increasingly span entire farms. And we talked about how you could make a big social impact in your own data science career by solving agricultural problems such as through using drones to seed crops automatically.
01:27:53
As always, you can get all the show notes, including the transcript for this episode, the video recording, any materials mentioned on the show, the URLs for Feroz, Jeremy, and Thomas's social media profiles, as well as my own social media profiles at superdatascience.com/705. All right, thanks to my colleagues at Nebula for supporting me while I create content like this SuperDataScience episode for you. And thanks of course to Ivana, Mario, Natalie, Serg, Sylvia, Zara, and Kirill on the SuperDataScience team for producing another delicious episode for us today. For enabling that super team to create this free podcast for you, we are deeply grateful to our sponsors. You can support this show by checking out our sponsors' links, which are in the show notes. Or you could rate or review the show on your favorite podcasting platform. You could like or comment on the episode on YouTube, or you could recommend the show to a friend or colleague whom you think would love it. But most importantly, I hope you just keep listening. If you like, you can subscribe to be sure not to miss any awesome upcoming episodes.
01:28:50
All right, thank you. Cheers, I'm so grateful to have you tuning in, and I hope I can continue to make episodes you love for years and years to come. Until next time, my friend, keep on rocking it out there and I'm looking forward to enjoying another round of the SuperDataScience podcast with you very soon.