# Library overview
## PyTorch
PyTorch is an open-source machine learning framework primarily used for deep learning and artificial intelligence tasks. It provides tools for building and training neural networks, with an emphasis on flexibility and ease of use, especially through dynamic computational graphs. PyTorch is widely used for tasks like image classification, natural language processing, and reinforcement learning, offering support for GPUs to accelerate computations.
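To illustrate the dynamic computational graph mentioned above, here is a minimal sketch (the tensor shape and operations are purely illustrative, not taken from the examples in this repository):

```python
import torch

# The graph is built on the fly as operations execute (eager/dynamic mode).
x = torch.randn(3, requires_grad=True)
y = (x * 2).sum()

# Backpropagate through the graph that was just recorded.
y.backward()
print(x.grad)  # dy/dx = 2 for every element of x
```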
## TensorFlow
TensorFlow is an open-source machine learning framework developed by Google, designed for building and deploying machine learning models. It supports both deep learning and traditional machine learning algorithms, and it can compile models into static computational graphs for optimized performance (while also offering eager execution since TensorFlow 2). TensorFlow is commonly used for tasks like image recognition, natural language processing, and large-scale machine learning applications, and it has tools for deploying models on various platforms, including mobile and web.
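As a point of comparison with the PyTorch sketch above, here is a minimal, illustrative sketch of a small classifier written with the Keras API that ships with TensorFlow (the layer sizes and metrics are our own choices, not taken from this repository):

```python
import tensorflow as tf

# A small feed-forward classifier expressed with the Keras API.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),                      # 4 input features
    tf.keras.layers.Dense(16, activation="relu"),    # hidden layer
    tf.keras.layers.Dense(3, activation="softmax"),  # 3 output classes
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```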
## nnet
nnet is an R package used for training feed-forward neural networks with a single hidden layer. It is primarily used for classification and regression tasks, offering a simpler neural network model compared to more advanced deep learning frameworks. The nnet package is particularly useful for modeling nonlinear relationships in data and is often applied to small-scale machine learning problems in research and practical applications.
## DeepLearning4j
DeepLearning4j (DL4J) is an open-source, distributed deep learning library for the Java and JVM ecosystem. It supports building and training neural networks and is designed to work in business environments that require large-scale deep learning applications. DL4J is used for tasks like image recognition, natural language processing, and time series analysis, and it integrates with popular big data tools like Hadoop and Spark for handling large datasets. Its focus on scalability and performance makes it suitable for enterprise-level AI solutions.
## TensorFlow.js
TensorFlow.js is a JavaScript library that allows you to build, train, and run machine learning models directly in the browser or in Node.js environments. It enables developers to leverage machine learning in web applications without needing backend servers, making it ideal for interactive, client-side tasks like real-time image processing, object detection, and natural language processing. TensorFlow.js also supports running pre-trained models, making it easy to integrate AI into JavaScript-based applications.
## Comparison Table of Libraries/APIs/DSLs for Deep Learning
Features | PyTorch | TensorFlow | nnet | DeepLearning4j | TensorFlow.js |
---|---|---|---|---|---|
Supported Languages | Python, C++ (LibTorch) | Python, C++ | R | Java, Scala | JavaScript (browser and Node.js) |
Supported Neural Network Types | Feedforward, CNN, RNN, LSTM, GAN, Transformers, etc. | Feedforward, CNN, RNN, LSTM, GAN, Transformers, etc. | Feedforward neural networks | Feedforward, CNN, RNN, LSTM, etc. | CNN, RNN, LSTM, etc. |
Programming Paradigm | Imperative (dynamic graph definition) | Imperative and declarative (static graphs and Eager Execution) | Functional | Imperative | Imperative |
Ease of Use | Very high, intuitive syntax close to standard Python | Good, but can be complex due to computational graphs | Simple for basic networks, less so for complex models | Moderate, requires good knowledge of Java/Scala | High, especially for JavaScript developers |
Performance | High, native GPU support with CUDA | Very high, optimizations for GPU and TPU | Moderate, suitable for small datasets | High, optimized for production deployment | Variable, depends on browser or Node.js performance |
Hardware Support | CPU, GPU (CUDA), initial support for TPUs via XLA | CPU, GPU, TPU | CPU only | CPU, GPU | CPU, GPU via WebGL or WebGPU |
Community and Ecosystem | Large active community, numerous tutorials and forums | Very large community, supported by Google, vast ecosystem | Smaller community, mainly academic | Growing community, integration with Java ecosystem | Active community, benefits from web ecosystem |
License | Open-source (BSD) | Open-source (Apache 2.0) | Open-source (GPL-2) | Open-source (Apache 2.0) | Open-source (Apache 2.0) |
Installation | Via pip for Python, binary downloads for LibTorch in C++ | Via pip for Python, manual compilation or binaries for C++ | Via `install.packages("nnet")` in R | Via dependency managers like Maven or Gradle | Via npm for Node.js, scripts for the browser |
Documentation | Excellent, clear documentation with practical examples | Very comprehensive, but can be complex for beginners | Good, but less exhaustive | Good documentation, examples for production integration | User-friendly documentation with numerous examples |
Visualization Tools | Integration with TensorBoard via `torch.utils.tensorboard` | TensorBoard for graph and metric visualization | Limited, requires external packages | Integration with Arbiter for model tracking | Support for TensorBoard via TensorFlow.js extension |
Integration with Other Libraries | Compatible with NumPy, SciPy, and other Python libraries | Integration with Keras, NumPy, supports TensorFlow Hub models | Compatible with other R packages | Integration with Hadoop, Spark, and other Big Data tools | Can import existing TensorFlow and Keras models |
Unsupervised Learning | Support via additional modules | Wide support, including reinforcement learning | Limited to simple neural networks | Support for unsupervised models like autoencoders | Support for unsupervised models adapted to the web |
Maturity | Very mature, used in production by large companies | Very mature, one of the first deep learning libraries | Mature for simple applications | Mature with a focus on enterprise applications | Relatively new but rapidly evolving |
Commercial Support | Supported by Meta (Facebook), with community contributions | Supported by Google, with commercial support options via Google Cloud | Mainly community support | Commercial support available through partners | Supported by Google, integration with cloud services |
# Metamodel
Click on `metamodel.svg` at the root of the repository to visualize the metamodel.
# LLM experiment
## PyTorch
Using ChatGPT (GPT-4o) with the prompt *"Give me an example of python code using pytorch neural network for classification"*, we obtained the following code:
```python
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import numpy as np

# Load and preprocess the Iris dataset
iris = load_iris()
X = iris['data']    # Features
y = iris['target']  # Labels (Classes)

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize the features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Convert data to PyTorch tensors
X_train = torch.tensor(X_train, dtype=torch.float32)
X_test = torch.tensor(X_test, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.long)
y_test = torch.tensor(y_test, dtype=torch.long)

# Define the Neural Network Model
class IrisNet(nn.Module):
    def __init__(self):
        super(IrisNet, self).__init__()
        self.fc1 = nn.Linear(4, 16)  # Input layer -> Hidden layer 1
        self.fc2 = nn.Linear(16, 8)  # Hidden layer 1 -> Hidden layer 2
        self.fc3 = nn.Linear(8, 3)   # Hidden layer 2 -> Output layer (3 classes)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)  # Output layer (no softmax because CrossEntropyLoss expects raw logits)
        return x

# Initialize the model, loss function, and optimizer
model = IrisNet()
criterion = nn.CrossEntropyLoss()  # For multi-class classification
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Train the model
num_epochs = 100
for epoch in range(num_epochs):
    model.train()

    # Forward pass
    outputs = model(X_train)
    loss = criterion(outputs, y_train)

    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if (epoch + 1) % 10 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

# Test the model
model.eval()  # Set model to evaluation mode
with torch.no_grad():
    test_outputs = model(X_test)
    _, predicted = torch.max(test_outputs, 1)
    accuracy = (predicted == y_test).sum().item() / y_test.size(0)
    print(f'Accuracy on test set: {accuracy * 100:.2f}%')
```
### Analysis of the result
There were no errors during execution; the code ran as expected. Here are the results:
```
Epoch [10/100], Loss: 1.1041
Epoch [20/100], Loss: 1.0783
Epoch [30/100], Loss: 1.0541
Epoch [40/100], Loss: 1.0287
Epoch [50/100], Loss: 0.9986
Epoch [60/100], Loss: 0.9607
Epoch [70/100], Loss: 0.9155
Epoch [80/100], Loss: 0.8622
Epoch [90/100], Loss: 0.8021
Epoch [100/100], Loss: 0.7376
Accuracy on test set: 70.00%
```
However, after several runs we observed a high variance in accuracy from run to run: the accuracy varies between 70% and 90%.
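This variance most likely comes from the random weight initialization, since the train/test split is already fixed by `random_state=42`. Here is a minimal sketch of how the runs could be made reproducible, assuming the generated script above is reused unchanged:

```python
import random
import numpy as np
import torch

# Fixing the seeds before the model is created makes the weight
# initialization, and therefore the final accuracy, reproducible.
seed = 42
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)
```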
# Example
Let's understand how `example_3` works!
The first line is `model = model3`. This creates a new identifier that tells us which model each subsequent property applies to. There is no real Python translation for it; it essentially corresponds to instantiating our model, i.e. `model = Model3()`, where `Model3` is a class that contains our neural network.
The second line, `loadData(dataset3)`, loads the dataset and puts it into variables. Note that the user does not call `split()`, which is why the train set equals the test set. It is represented as:
```python
import pandas as pd
import torch
from torch.utils.data import TensorDataset, DataLoader

batch_size = 32  # taken from the setHyperparameter step below

data = pd.read_csv('your_data.csv')
X = data.iloc[:, :-1].values
y = data.iloc[:, -1].values
X = torch.tensor(X, dtype=torch.float32)
y = torch.tensor(y, dtype=torch.long)

# No split(), so the same data serves as both train set and test set
train_dataset = TensorDataset(X, y)
test_dataset = TensorDataset(X, y)
train_loader = DataLoader(train_dataset, batch_size=batch_size)
test_loader = DataLoader(test_dataset, batch_size=batch_size)
```
The third line, `configure {`, opens the block that defines the whole configuration; this line has no direct representation in Python or other languages.
The fourth line, `model3.setHyperparameter(epochs=15, lr=0.01, batchSize=32)`, sets up all the important hyperparameters that will be used for the training part. In Python we translate it into plain variable assignments, as shown in the sketch below.
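A minimal sketch of those assignments (the variable names are our own choice, not imposed by the DSL):

```python
# Hyperparameters taken from model3.setHyperparameter(epochs=15, lr=0.01, batchSize=32)
epochs = 15
lr = 0.01
batch_size = 32
```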
The fifth line, `model3.compile(linear(256, 128), applySigmoid(0.6), linear(128, 10), applySoftmax(0.3))`, defines the model and all of its layers. To do that in Python we create a `Model3` class like this:
```python
import torch.nn as nn

class Model3(nn.Module):
    def __init__(self):
        super(Model3, self).__init__()
        self.layer1 = nn.Linear(256, 128)
        self.layer2 = nn.Linear(128, 10)
        self.sigmoid = nn.Sigmoid()
        self.softmax = nn.Softmax(dim=1)  # dim=1: softmax over the class dimension

    def forward(self, x):
        x = self.layer1(x)
        x = self.sigmoid(x)
        x = self.layer2(x)
        x = self.softmax(x)
        return x
```
The sixth line, `model1.SGDoptimize(0.001)`, is represented in Python (after `import torch.optim as optim`) as `optimizer = optim.SGD(model.parameters(), lr=0.01)`.
The seventh line, `model1.setLoss(crossEntropy)`, is represented with `criterion = nn.CrossEntropyLoss()`.
The eighth line, `model3.train(dataset3)`, covers the training process. To do that we need to write a training loop and use all the previously declared variables; it is represented with this function:
```python
def train_model(model, dataset, epochs=15, batch_size=32):
    # Build a loader from the dataset passed in (mirrors model3.train(dataset3))
    loader = DataLoader(dataset, batch_size=batch_size)
    model.train()
    for epoch in range(epochs):
        for inputs, targets in loader:
            optimizer.zero_grad()               # optimizer from the sixth line
            outputs = model(inputs)
            loss = criterion(outputs, targets)  # criterion from the seventh line
            loss.backward()
            optimizer.step()
        print(f"Epoch {epoch+1}/{epochs}, Loss: {loss.item()}")

train_model(model, train_dataset, epochs=epochs, batch_size=batch_size)
```
The last line, `model3.evaluate(dataset3)`, covers the evaluation process. As with training, we need to create a function and reuse the previously declared variables; it is represented with:
```python
def evaluate_model(model, dataset, batch_size=32):
    # Build a loader from the dataset passed in (mirrors model3.evaluate(dataset3))
    loader = DataLoader(dataset, batch_size=batch_size)
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for inputs, targets in loader:
            outputs = model(inputs)
            _, predicted = torch.max(outputs, 1)
            total += targets.size(0)
            correct += (predicted == targets).sum().item()
    accuracy = correct / total
    print(f"Accuracy: {accuracy * 100:.2f}%")

evaluate_model(model, test_dataset)
```