CDT NGCM

Summer Academy

2021

In June 2021 the CDT-NGCM management team ran a Natural Language Processing course to further expand the range of training opportunities for our students. The course replaced the Summer Academy and was offered to NGCM students and to students outside the CDT. It was very popular and places were booked quickly. The information below was published before the course and is kept here for reference.

An Introduction to Natural Language Processing using Python

An introductory course on Natural Language Processing technologies using Python, aimed at students and researchers who want to extract valuable information from their text data.

The course offers a balance between theoretical foundations of NLP and practical examples using the Python programming language and its modern NLP ecosystem.

By attending this course, you will learn:

• how to choose data representations to effectively work with text data

• exploratory techniques to quickly gain insights from your text data

• how to implement Machine Learning techniques to organise your documents into categories

• how to evaluate the quality of your models

• ideas on advanced applications using Natural Language data

No prior experience with libraries such as NLTK or scikit-learn is required for this course. Existing experience with Python is extremely beneficial but not required: users of other programming languages and tools (e.g. Java, C++, C#, JavaScript, Matlab, Excel or R) will also find this course useful.

The training course was run by Bonzanini Consulting, who specialise in Data Science, Consultancy and Training. They offer a range of training courses, spanning from Python programming foundations to specialised Data Analytics and Machine Learning classes, and we are very pleased that they are able to provide this NLP training course.

The course took place online on Monday 14 June 2021 and Tuesday 15 June 2021. It ran for two full days (09:30 to 16:30), and attendance on both days was mandatory.

2020

In September 2020 the CDT-NGCM management team ran a Geospatial Data and Analysis training course to further expand the range of training opportunities for our students. The course replaced the Summer Academy and was offered to NGCM students and to students outside the CDT. The information below was published before the course and is kept here for reference.

The course was run by Chris Jochem, a Senior Research Fellow with the WorldPop group (https://www.worldpop.org/) in the School of Geography and Environmental Science at the University of Southampton. Chris is a geographer. His research focuses on understanding the distribution, dynamics, and health of human populations around the world. He is particularly interested in developing spatial statistical and computational methods that integrate satellite data, surveys and other geospatial data layers.

Dr Warren C. Jochem

Geographic data seems to be everywhere now, from geo-tagged photos and tweets, GPS watches and fitness trackers, and spatially-referenced survey data to high-resolution satellite imagery. While such data presents many new opportunities to study our world, it also presents unique challenges for analysis. The objective of this workshop was to introduce participants to the fundamentals of how to use, combine, analyse, and present spatial data while avoiding common pitfalls.

This two-day workshop used a mixture of short lectures and live demos to introduce concepts, with the majority of the time spent in hands-on labs working with real datasets. Participants met virtually and had the opportunity to collaborate in small groups for the activities. All material used Python and several key packages (including geopandas, rasterio, PySAL, and folium). The workshop was intended for anyone wishing to gain familiarity with spatial data and core analysis techniques.

No previous experience with spatial data was expected. Participants needed experience with Python. Some knowledge of pandas and/or numpy was considered helpful but was not required.

Workshop topics included:

  • Spatial data formats, creating, reading and writing geometries and files
  • Projections, distortions, and coordinate reference systems
  • Joining, querying, clipping, buffering, intersecting and other common operations
  • Common analyses for cluster detection, density and distance calculations
  • Mapping and visualisation

The course was provided online over 2 days in September 2020.

Further details about the 2020 course can be found here, and the 2019 Summer Academy here, while blog posts detailing previous Summer Academies can be found here.

Priority will usually be given to Postgraduate Research (PGR) students. Anyone who is not a PhD student and wants to participate in any of our sessions should email cdt-ngcm@soton.ac.uk before registering. Non-PGR students may be removed to make space for PhD students.