From fc93781c53619241f98024707a77234ee6337cef Mon Sep 17 00:00:00 2001
From: BUI Van Tuan <buivantuan07@gmail.com>
Date: Thu, 25 Jan 2024 15:17:42 +0100
Subject: [PATCH] add a conflict use-case

---
 0-Setup.md   |  98 ++++++++++++++++++------
 a-MSE.md     |  91 ++++++++++++++++++----
 c-LOGGING.md | 210 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 358 insertions(+), 41 deletions(-)

diff --git a/0-Setup.md b/0-Setup.md
index 41491d5..785c536 100644
--- a/0-Setup.md
+++ b/0-Setup.md
@@ -1,7 +1,34 @@
 # Environment setup
 
+## Configure Git
+
+To start using Git from your computer, you must enter your credentials to identify yourself as the author of your work. The full name and email address should match the ones you use in GitLab.
+
+1. In your shell, add your full name:
+
+   ```bash
+   git config --global user.name "John Doe"
+   ```
+
+2. Add your email address:
+
+   ```bash
+   git config --global user.email "your_email_address@example.com"
+   ```
+
+3. To check the configuration, run:
+
+   ```bash
+   git config --global --list
+   ```
+
+   The `--global` option tells Git to always use this information for anything you do on your system. If you omit `--global` or use `--local`, the configuration applies only to the current repository.
+
+
 ## SSH key setup
 
+GitLab uses the SSH protocol to securely communicate with Git. When you use SSH keys to authenticate to the GitLab remote server, you don’t need to supply your username and password each time. However, because GitLab LISN (https://gitlab.lisn.upsaclay.fr/) restricts SSH connections from external networks (such as Wi-Fi), you are going to use [HTTPS](#https-setup) in this lab session.
+
 To clone the project code, you'll need to use Git. To do this, you first need to set up an SSH key pair and import it into your Gitlab account. To do this, follow the steps below:
 
 If you do not have an existing SSH key pair, generate a new one:
@@ -35,6 +62,25 @@ Next, add an SSH key to your GitLab account;
 8.
In the Title box, type a description, like `Gitlab workshop`.
 9. Select Add key.
 
+## HTTPS setup
+
+Clone with HTTPS when you want to authenticate each time you perform an operation between your computer and GitLab. You can avoid entering your username and password every time by cloning over HTTPS with a personal access token:
+
+```bash
+git clone https://<username>:<token>@gitlab.lisn.upsaclay.fr/[namespace]/scoring.git
+```
+You can create as many personal access tokens as you like.
+
+1. On the left sidebar, select your avatar.
+2. Select Edit profile.
+3. On the left sidebar, select Access Tokens.
+4. Select Add new token.
+5. Enter a name (like **Gitlab Workshop Token**) and an expiry date for the token (set it to 365 days after the current date).
+6. Select the `read_repository` and `write_repository` scopes.
+7. Select Create personal access token.
+
+Save the personal access token somewhere safe. After you leave the page, you no longer have access to the token.
+
 ## Setting up working groups
 
 To simulate a team, we invite you to form groups of 4 people, trying to mix people from different laboratories. One member of the group will be "Project Leader", creating a fork in your personal namespace:
@@ -49,23 +95,32 @@ GitLab creates your fork, and redirects you to the new fork’s page.
 The `main` branch stores the official release history, and the `develop` branch serves as an integration branch for features.
 The Project leader creates a `develop` branch locally and push it to the server:
 * Using SSH if utilizing Ethernet connection:
-```bash
-git clone git@serveur-gitlab.lisn.upsaclay.fr:[namespace]/scoring.git
-cd scoring
-```
+  ```bash
+  git clone git@serveur-gitlab.lisn.upsaclay.fr:[namespace]/scoring.git
+  cd scoring
+  ```
 
 * Using HTTPS if utilizing Wifi connection:
-```bash
-git clone https://gitlab.lisn.upsaclay.fr/[namespace]/scoring.git
-cd scoring
-```
+  ```bash
+  git clone https://<username>:<token>@gitlab.lisn.upsaclay.fr/[namespace]/scoring.git
+  cd scoring
+  ```
 
-```bash
-git branch develop
-git push -u origin develop
-```
+* Creating a `develop` branch
+  ```bash
+  git branch develop
+  ```
 
-This branch will contain the complete history of the project, whereas `main` will contain an abridged version. Other developers should now clone the central repository and create a tracking branch for develop.
+* Pushing the commits from your local `develop` branch to the `develop` branch in the remote repository
+  ```bash
+  git push origin develop
+  ```
+
+  `origin`: this is the default name Git gives to the remote repository from which your local repository was cloned (or the primary remote repository you're working with). It's like a nickname for the URL of the remote repository. You can have multiple remotes with different names, but `origin` is the conventional default.
+
+  `develop`: this specifies the branch that you want to push.
+
+This branch will contain the complete history of the project, whereas `main` will contain an abridged version. Other developers should now clone the central repository and switch to develop.
 
 Now, add your collaborators to the project:
 1. Select Manage > Members.
@@ -80,20 +135,19 @@ Now, add your collaborators to the project:
 1.
Clone the repository
 
 * Using SSH if utilizing Ethernet connection:
-```bash
-git clone git@serveur-gitlab.lisn.upsaclay.fr:[namespace]/scoring.git
-cd scoring
-```
+  ```bash
+  git clone git@serveur-gitlab.lisn.upsaclay.fr:[namespace]/scoring.git
+  cd scoring
+  ```
 
 * Using HTTPS if utilizing Wifi connection:
-```bash
-git clone https://gitlab.lisn.upsaclay.fr/[namespace]/scoring.git
-cd scoring
-```
+  ```bash
+  git clone https://<username>:<token>@gitlab.lisn.upsaclay.fr/[namespace]/scoring.git
+  cd scoring
+  ```
 
 The developer should create feature branchs from the latest `develop` branch.
 
 ## Resources and useful links
 * SSH keys: https://docs.gitlab.com/ee/user/ssh.html
 * Gitflow workflow: https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow
-
diff --git a/a-MSE.md b/a-MSE.md
index 3a49710..10465c4 100644
--- a/a-MSE.md
+++ b/a-MSE.md
@@ -25,12 +25,21 @@ $$ MSE = \frac{1}{N} \sum_{i=1}^{N} (predicted_i - actual_i)² $$
 
 - Create a feature `a_MSE` branch:
 
-```bash
-git pull origin develop
-git checkout develop
-git checkout -b a_MSE
-```
+  * Switch to your local `develop` branch first, because `git pull` merges into the branch that is currently checked out.
+    ```bash
+    git checkout develop
+    ```
+
+  * Fetch the changes from the `develop` branch of the remote repository and merge them into your local `develop` branch, because the feature branch must be created from the **latest** `develop` branch.
+    ```bash
+    git pull origin develop
+    ```
+
+  * Create a new branch named `a_MSE` and switch to it.
+    ```bash
+    git checkout -b a_MSE
+    ```
 </p>
 
 <p>
 
@@ -106,11 +115,37 @@ if __name__ == "__main__":
 
 - Commit and push changes:
 
-```bash
-git add score.py
-git commit -m "implement MSE metric"
-git push origin a_MSE
-```
+  * `git status`
+    ```bash
+    On branch a_MSE
+    Changes not staged for commit:
+      (use "git add <file>..." to update what will be committed)
+      (use "git restore <file>..."
to discard changes in working directory)
+            modified:   score.py
+
+    no changes added to commit (use "git add" and/or "git commit -a")
+    ```
+
+    When you modify a file in your working directory, Git notices that the file has changed, but the change is not yet in the staging area. Git lists such files under "Changes not staged for commit". The staging area is a layer that sits between your working directory and the repository. It's where you prepare and organize your changes before actually committing them to the project history.
+
+  * Move these changes to the staging area:
+    ```bash
+    git add score.py
+    ```
+
+    Once changes are staged, they will be part of your next commit.
+
+  * Commit these changes:
+    ```bash
+    git commit -m "implement MSE metric"
+    ```
+
+    After committing, these changes move from the staging area to your repository.
+
+  * Pushing the commit from your local `a_MSE` branch to the `a_MSE` branch in the remote repository:
+    ```bash
+    git push origin a_MSE
+    ```
 </p>
 
 </details>
@@ -191,12 +226,38 @@ pytest
 
 - Commit and push changes:
 
-```bash
-git add tests/test_metrics.py
-git commit -m "implement unit tests for metrics"
-git push origin a_MSE
-```
+  * `git status`
+    ```bash
+    On branch a_MSE
+    Changes not staged for commit:
+      (use "git add <file>..." to update what will be committed)
+      (use "git restore <file>..." to discard changes in working directory)
+            modified:   tests/test_metrics.py
+
+    no changes added to commit (use "git add" and/or "git commit -a")
+    ```
+    When you modify a file in your working directory, Git notices that the file has changed, but the change is not yet in the staging area. Git lists such files under "Changes not staged for commit". The staging area is a layer that sits between your working directory and the repository. It's where you prepare and organize your changes before actually committing them to the project history.
+
+  * Move these changes to the staging area:
+    ```bash
+    git add tests/test_metrics.py
+    ```
+
+    Once changes are staged, they will be part of your next commit.
+
+  * Commit these changes:
+    ```bash
+    git commit -m "implement unit tests for metrics"
+    ```
+
+    After committing, these changes move from the staging area to your repository.
+
+  * Pushing the commit from your local `a_MSE` branch to the `a_MSE` branch in the remote repository:
+    ```bash
+    git push origin a_MSE
+    ```
+
 </p>
 
 </details>
 
diff --git a/c-LOGGING.md b/c-LOGGING.md
index f2db9dd..2dd1e69 100644
--- a/c-LOGGING.md
+++ b/c-LOGGING.md
@@ -1,14 +1,14 @@
 # Logging
 
-Implement logging to capture the process and any potential errors. This will help in debugging and maintenance.
+Implement `logging` to capture the process and any potential errors. This will help in debugging and maintenance. We are now going to construct a scenario in which a merge conflict arises, and we will resolve it. This will involve both the developer and the tester concurrently working on implementing the `logging` feature.
 
 ## Implement logging
 
 <details>
-<summary>Solution</summary>
+<summary>Scenario</summary>
 <p>
 
-- Create a feature `c_logging` branch:
+- The developer and tester create a feature `c_logging` branch:
 
 ```bash
 git checkout develop
@@ -20,7 +20,7 @@ git checkout -b c_logging
 
 <p>
 
-- `score.py`:
+- The developer pushes a new `score.py` with logging:
 
 ```python
 #!/usr/bin/env python3
@@ -146,6 +146,208 @@ if __name__ == "__main__":
 
 <p>
 
+- The tester pushes a new `score.py` with logging:
+
+```python
+#!/usr/bin/env python3
+# -*- coding: utf-8 -*-
+import os
+import sys
+from sys import argv
+import numpy as np
+import logging
+
+# Configure logging
+logging.basicConfig(filename='score.log', filemode='w',
+                    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
+                    level=logging.DEBUG)
+
+
+# ========= Useful functions ==============
+def read_array(filename):
+    ''' Read array and convert to 2d np arrays '''
+    logging.info(f"Attempting to read file: {filename}")
+
+    if not os.path.exists(filename):
+        logging.error(f"The file {filename} does not exist.")
+        raise
FileNotFoundError(f"The file {filename} does not exist.")
+
+    formatted_array = []
+
+    with open(filename, 'r') as file:
+        for line_num, line in enumerate(file, start=1):
+            # Split the line into elements and strip whitespace
+            elements = line.strip().split()
+
+            # Check if there are exactly three elements
+            if len(elements) != 3:
+                logging.error(f"Error in {filename}, line {line_num}: Expected 3 elements, found {len(elements)}")
+                raise ValueError(f"Error in {filename}, line {line_num}: Expected 3 elements, found {len(elements)}")
+
+            # Check if all elements are either '0' or '1'
+            if not all(elem in ['0', '1'] for elem in elements):
+                logging.error(f"Error in {filename}, line {line_num}: Elements must be '0' or '1'")
+                raise ValueError(f"Error in {filename}, line {line_num}: Elements must be '0' or '1'")
+
+            # Convert elements to integers and add to the array
+            formatted_array.append([int(elem) for elem in elements])
+
+    logging.info(f"File {filename} read successfully.")
+    # Convert the list to a numpy array
+    return np.array(formatted_array)
+
+
+def accuracy_metric(solution, prediction):
+    logging.debug("Calculating accuracy metric")
+    if len(solution) == 0 or len(prediction) == 0:
+        logging.warning("Received empty array(s) in accuracy_metric")
+        return 0
+    correct_samples = np.all(solution == prediction, axis=1)
+    accuracy = np.mean(correct_samples)
+    logging.info(f"Accuracy metric calculated successfully: {accuracy}")
+    return accuracy
+
+
+def mse_metric(solution, prediction):
+    '''Mean-square error.
+    Works even if the target matrix has more than one column'''
+    logging.debug("Calculating MSE metric")
+    if len(solution) == 0 or len(prediction) == 0:
+        logging.warning("Received empty array(s) in mse_metric")
+        return 0
+    mse = np.sum((solution - prediction)**2, axis=1)
+    mse = np.mean(mse)
+    logging.info(f"MSE metric calculated successfully: {mse}")
+    return mse
+
+
+def _HERE(*args):
+    h = os.path.dirname(os.path.realpath(__file__))
+    return os.path.join(h, *args)
+
+
+# =============================== MAIN ========================================
+if __name__ == "__main__":
+
+    #### INPUT/OUTPUT: Get input and output directory names
+    try:
+        logging.debug("Score execution started.")
+        prediction_file = argv[1]
+        solution_file = argv[2]
+        # Read the solution and prediction values into numpy arrays
+        solution = read_array(solution_file)
+        prediction = read_array(prediction_file)
+    except IndexError:
+        logging.error("Incorrect usage: script requires two arguments for prediction and solution files.")
+        print("Usage: script.py predict.txt solution.txt")
+        sys.exit(1)
+    except (FileNotFoundError, IOError) as e:
+        logging.error(e)
+        sys.exit(1)
+
+    score_file = open(_HERE('scores.txt'), 'w')
+    # # Extract the dataset name from the file name
+    prediction_name = os.path.basename(prediction_file)
+
+    # Check if the shapes of the arrays are compatible
+    if prediction.shape != solution.shape:
+        logging.error("Error: Prediction and solution arrays have different shapes.")
+        sys.exit(1)
+
+    # Compute the score prescribed by the metric file
+    accuracy_score = accuracy_metric(solution, prediction)
+    mse_score = mse_metric(solution, prediction)
+    print(
+        "======= (" + prediction_name + "): score(accuracy_metric)=%0.2f =======" % accuracy_score)
+    print(
+        "======= (" + prediction_name + "): score(mse_metric)=%0.2f =======" % mse_score)
+    # Write score corresponding to selected task and metric to the output file
+    score_file.write("accuracy_metric: %0.2f\n" % accuracy_score)
+    score_file.write("mse_metric: %0.2f\n" % mse_score)
+    score_file.close()
+    logging.info("Score completed successfully")
+```
+
+</p>
+
+<p>
+
+- Whoever pushes second will see an error: the first push succeeds, and the later one is rejected.
+
+```bash
+To https://gitlab.lisn.upsaclay.fr/tuanbui/scoring.git
+ ! [rejected]        c_logging -> c_logging (fetch first)
+error: failed to push some refs to 'https://gitlab.lisn.upsaclay.fr/tuanbui/scoring.git'
+hint: Updates were rejected because the remote contains work that you do
+hint: not have locally. This is usually caused by another repository pushing
+hint: to the same ref. You may want to first integrate the remote changes
+hint: (e.g., 'git pull ...') before pushing again.
+hint: See the 'Note about fast-forwards' in 'git push --help' for details.
+```
+
+This means someone else has pushed changes to the remote repository that you don't have in your local branch. This situation often occurs in collaborative environments where multiple people are pushing to the same repository or the same branch within a repository. Your push was rejected to prevent overwriting those changes. Before you can push your changes, you need to fetch the latest changes from the remote repository and merge them into your local branch.
+
+</p>
+
+<p>
+
+- Merge the remote changes into your local branch with `git pull origin c_logging`:
+
+```bash
+remote: Enumerating objects: 3, done.
+remote: Counting objects: 100% (3/3), done.
+remote: Compressing objects: 100% (3/3), done.
+remote: Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
+Unpacking objects: 100% (3/3), 2.08 KiB | 2.08 MiB/s, done.
+From https://gitlab.lisn.upsaclay.fr/tuanbui/scoring
+ * branch            c_logging  -> FETCH_HEAD
+ * [new branch]      c_logging  -> origin/c_logging
+Auto-merging score.py
+CONFLICT (content): Merge conflict in score.py
+Automatic merge failed; fix conflicts and then commit the result.
+```
+
+The automatic merge failed, so the conflicts in `score.py` must now be resolved manually.
+
+</p>
+
+<p>
+
+- List the files affected by the merge conflict with `git status`:
+
+```bash
+On branch c_logging
+You have unmerged paths.
+  (fix conflicts and run "git commit")
+  (use "git merge --abort" to abort the merge)
+
+Unmerged paths:
+  (use "git add <file>..." to mark resolution)
+        both modified:   score.py
+
+no changes added to commit (use "git add" and/or "git commit -a")
+```
+
+Open your favorite text editor, such as [Visual Studio Code](https://code.visualstudio.com/), and navigate to the file that has merge conflicts. To see the beginning of the merge conflict in your file, search the file for the conflict marker `<<<<<<<`. You'll see the changes from the HEAD or base branch after the line `<<<<<<< HEAD`. Next, you'll see `=======`, which divides your changes from the changes in the other branch, followed by `>>>>>>> BRANCH-NAME/COMMIT`. In this example, one person wrote `logging.info(f"Attempting to read file: {filename}")` in the base or HEAD branch and another person wrote `logging.debug(f"Attempting to read file: {filename}")` in the compare branch or commit.
+
+```python
+# ========= Useful functions ==============
+def read_array(filename):
+    ''' Read array and convert to 2d np arrays '''
+<<<<<<< HEAD
+    logging.info(f"Attempting to read file: {filename}")
+=======
+    logging.debug(f"Attempting to read file: {filename}")
+>>>>>>> 7890dd5cff2eef84a2c70174e5a8846beaa8bf78
+
+    if not os.path.exists(filename):
+```
+
+Decide if you want to keep only your changes, keep only the other changes, or make a brand new change, which may incorporate changes from both. Delete the conflict markers `<<<<<<<`, `=======`, `>>>>>>>` and make the changes you want in the final merge. In this example, keep only `logging.debug(f"Attempting to read file: {filename}")` for a debug message.
Stage your changes (`git add score.py`) and commit them with a message (`git commit -m "Resolve merge conflict by keeping the debug message"`). Finally, push the merged version (`git push origin c_logging`).
+</p>
+
+<p>
+
 - See the logs:
 
 ```bash
-- 
GitLab