Generative Language AI Certificate

Boise State’s Computer Science department has decided to launch a certificate program for students interested in Large Language Models (LLMs) from a technical standpoint. The program equips students with the technical skills to train these models while understanding the underlying semantics. Students start with an introductory data science course as well as a machine learning course. From there, they take a natural language processing class that delves into the intricacies of LLMs. The goal of the program is to empower students to develop and deploy their own language models, a significant push as industry trends begin to steer away from large closed models such as ChatGPT or Bard. The emphasis is on learning to build independent programs and to make use of language models.

Students who wish to complete the certificate are required to take the following courses:

Bubble map showing the required course sequence: CS 133 to CS 233 to CS 334 to CS 436, with MATH 360/361 and MATH 301 pointing to CS 334.

Information regarding each course

CS 133 – Foundations of Data Science: Introduction to Python programming and common Python data science libraries. Simple data visualization. Introduction to basic statistics including distributions and random sampling, testing statistical hypotheses, estimation, prediction, comparison, causality, and decisions. Introduction to classification methods.

  • Units: 3.00

 

CS 233 – Essentials of Data Science: Introduction to data formats, data collection, data manipulation, data visualization, data cleaning, and data analysis. Introduction to probability and information theory, basic database queries, supervised classification, interpretation of results, and ethical considerations.

  • PREREQ: CS 133
  • Units: 3.00

 

CS 334 – Algorithms of Machine Learning: Supervised classification, unsupervised classification, reinforcement learning, feature engineering, machine learning workflow, linear algebra for machine learning, evaluation metrics, survey of machine learning applications, ethical considerations.

  • PREREQ: CS 233 and (MATH 360 or MATH 361)
  • COREQ: MATH 301
  • Units: 3.00

 

CS 436 – Natural Language Processing: Probability theory, information theory, and linguistics. Machine learning techniques applied to language data, including generative and discriminative classification related to language modeling, syntactic parsing, sequence tagging, and lexical semantics.

  • Units: 3.00
  • Has covered LLMs (Large Language Models) since 2021

 

Required Math courses

MATH 360 – Engineering Statistics: Calculus-based survey of statistical techniques used in engineering. Data collection and organization, basic probability distributions, sampling, confidence intervals, hypothesis testing, process control, simple regression techniques, design of experiments. Emphasis on examples and applications to engineering, including product reliability, robust design, and quality control. Credit cannot be earned for both MATH 360 and MATH 361.

  • Units: 3.00

OR

MATH 361 – Probability and Statistics I: Calculus-based treatment of probability theory, random variables, distributions, conditional probability, central limit theorem, descriptive statistics, estimation, tests of hypotheses, and regression. Differs from MATH 360 by providing more thorough coverage of theoretical foundations and a wider variety of applications drawn from natural and social sciences as well as engineering. Credit cannot be earned for both MATH 360 and MATH 361.

  • Units: 3.00

 

MATH 301 – Introduction to Linear Algebra: Linear algebra from a matrix perspective with applications from the applied sciences. Topics include the algebra of matrices, methods for solving linear systems of equations, eigenvalues and eigenvectors, matrix decompositions, vector spaces, linear transformations, least squares, and numerical techniques.

  • Units: 3.00