| Module | Description |
|---|---|
| gcp | Google Cloud infrastructure module |
| gcp/k8s | MongoDB replica set Kubernetes deployment module |
| gcp/terraform | Terraform GKE cluster module |
| results-usl | Final scalability results, varying the write concern factor |
| results-usl/experimentUSL1 | Scalability results with write concern w = majority |
| results-usl/experimentUSL2 | Scalability results with write concern w = 1 |
| results-usl/experimentUSL3 | Scalability results with write concern w = majority (6 replica nodes) |
| results-usl/experimentUSL4 | Scalability results with write concern w = 1 (6 replica nodes) |
| results-usl/workload1 | Scalability results for experiments 1 and 2 (workload1) |
| results-v4 | Final experiment results with 7 factors and 2 levels (5 repetitions) |
| results-v3 | Third experimental results attempt with 7 factors and 2 levels (5 repetitions) |
| results-v2 | Second experimental results attempt with 7 factors and 2 levels (5 repetitions) |
| results-v1 | First experimental results attempt with 7 factors and 2 levels (5 repetitions) |
| workloads | Workload definition files |
| logs | Logging modules |
| runner | Workload runner script |
| Dockerfile | Dockerfile for the runner with YCSB |
| Docker Compose File | Docker Swarm cluster definition |
| concierge | Workload module cleaner |
| janitor | Database cleaner |
| moca | Our own benchmark tool attempt |
| populate | Populate database script |
| results-aggregator | Tool for aggregating experiment results |
| get_results | Copies experiment results from the cloud environment and aggregates them |
| rebuild_pods | Rebuilds the MongoDB Kubernetes StatefulSet and Service |
| Factor | Level -1 | Level 1 |
|---|---|---|
| Write Concern (A) | Majority | 1 Ack |
| Replica Writer Thread Range (B) | [0:16] Threads | [0:128] Threads |
| Read Concern (C) | Local | Majority |
| Read Preference (D) | Primary Preferred | Secondary Preferred |
| Replica Batch Limit (E) | 50MB | 100MB |
| Replica Node Configuration (F) | Primary-Secondary-Secondary | Primary-Secondary-Arbiter |
| Chaining (G) | Disabled | Enabled |
| Module | Description |
|---|---|
| outputs | Folder with the raw YCSB output |
| results-throughput.dat | Throughput results |
| results-latency-insert.dat | Insert operation latency results |
| results-latency-read.dat | Read operation latency results |
| results-latency-scan.dat | Scan operation latency results |
| results-latency-update.dat | Update operation latency results |
All three of these factors (write concern, read concern and read preference) are passed directly to our runner script, since they are set on the MongoDB connection string of each client request:

```
./runner.sh <other-flags> -W <write-concern> -R <read-concern> -P <read-preference>
```

- Write concern: `-W majority` or `-W 1`
- Read concern: `-R local` or `-R majority`
- Read preference: `-P primaryPreferred` or `-P secondaryPreferred`
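For reference, a run with `-W majority -R majority -P secondaryPreferred` would translate into connection string options along these lines (the host names and database name here are illustrative, not taken from the repository):

```
mongodb://mongo-0.mongo:27017,mongo-1.mongo:27017,mongo-2.mongo:27017/ycsb?replicaSet=rs0&w=majority&readConcernLevel=majority&readPreference=secondaryPreferred
```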
- "--setParameter"
- "replWriterMinThreadCount=0"
- "--setParameter"
- "replWriterThreadCount=16" - "--setParameter"
- "replWriterMinThreadCount=0"
- "--setParameter"
- "replWriterThreadCount=128" - "--setParameter"
- "replBatchLimitBytes=52428800" - "--setParameter"
- "replBatchLimitBytes=104857600"Create the replica set, for example replica set named "rs0" with mongo-0 as primary, mongo-1 and mongo-2 as secondaries:
Create the replica set; for example, a replica set named "rs0" with mongo-0 as primary, and mongo-1 and mongo-2 as secondaries:

```
kubectl exec mongo-0 -- mongo --eval 'rs.initiate({_id: "rs0", version: 1, members: [{_id: 0, host: "mongo-0.mongo:27017"}, {_id: 1, host: "mongo-1.mongo:27017"}, {_id: 2, host: "mongo-2.mongo:27017"}]});'
```
Create the replica set, for example replica set named "rs0" with mongo-0 as primary, mongo-1 as secondary and mongo-2 as arbiter:
kubectl exec mongo-0 -- mongo --eval 'rs.initiate({_id: "rs0", version: 1, members: [{_id: 0, host: "mongo-0.mongo:27017"}, {_id: 1, host: "mongo-1.mongo:27017"}, {_id: 2, host: "mongo-2.mongo:27017", arbiterOnly: true}]});'
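To double-check which role each member ended up with (PRIMARY, SECONDARY or ARBITER), one option is to query `rs.status()` on the primary; this is just a sanity check, not part of the original scripts:

```
kubectl exec mongo-0 -- mongo --eval 'rs.status().members.forEach(function(m) { print(m.name, m.stateStr); });'
```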
To disable chaining, simply create the replica set with the setting chainingAllowed set to false (the members array is redacted for legibility):

```
kubectl exec mongo-0 -- mongo --eval 'rs.initiate({_id: "rs0", version: 1, members: [...], settings: {chainingAllowed: false}});'
```

To enable chaining, create the replica set with the setting chainingAllowed set to true (the members array is again redacted for legibility):

```
kubectl exec mongo-0 -- mongo --eval 'rs.initiate({_id: "rs0", version: 1, members: [...], settings: {chainingAllowed: true}});'
```

Then force one of the secondaries to use the other secondary as its sync source. In this example, mongo-0 is the primary and we force mongo-2 to sync from mongo-1:

```
kubectl exec mongo-2 -- mongo --eval 'db.adminCommand( { replSetSyncFrom: "mongo-1.mongo:27017" });'
```
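If needed, the effective chaining setting can be read back from the replica set configuration (again a quick sanity check rather than part of the original workflow):

```
kubectl exec mongo-0 -- mongo --eval 'print(rs.conf().settings.chainingAllowed);'
```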
Provision the infrastructure:

```
cd gcp/terraform
terraform init    # needed once, before the first apply
terraform apply
```

Connect to the cluster by getting the command-line access command from the GCP console, for example:
```
gcloud container clusters get-credentials <gcloud_credentials> --region <region> --project <project_id>
```

To watch the creation of the pods (optional):

```
watch -x kubectl get pods
```

Clean the existing environment (if one already exists) and create the StatefulSet and Service. This step also initiates the replica set, taking chaining and architecture (PSS or PSA) as boolean parameters:
```
cd ..
./rebuild_pods.sh -c <chaining_enabled> -a <arbiter_exists>
```
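For instance, a PSA deployment without chaining would be requested along these lines (assuming the script accepts literal true/false values for its boolean flags):

```
./rebuild_pods.sh -c false -a true
```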
Run a pod with our ycsb image hosted on Docker Hub:

```
kubectl run ycsb --rm -it --image aaugusto11/ycsb -- /bin/bash
```

Or build a local image of ycsb and run the pod:
```
cd ../../
docker build -t ycsb:latest .
kubectl run ycsb --rm -it --image ycsb:latest --image-pull-policy=Never -- /bin/bash
```
Run the script to perform a benchmark experiment:

```
./runner.sh -w workload1 -e experiment1 -i 5 -c 1 -x throughput -m 16 -n 16 -s 1 -r 5 -W 1 -R majority -P primary
```

This runs workload1 as the experiment with id 1, performs 5 iterations on the cloud (-c 1), going from 16 to 16 client threads in increments of 1, and repeats each run of the workload 5 times. Each request uses writeConcern w = 1 and readConcern = majority, reading from the primary.
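For context, the runner drives standard YCSB load/run phases against the replica set; a manual invocation from inside the pod would look roughly like this (the workload file, database name and hosts are illustrative, and the exact properties our script passes may differ):

```
# Hypothetical manual YCSB run that runner.sh automates
./bin/ycsb load mongodb -s -P workloads/workloada -threads 16 \
  -p mongodb.url="mongodb://mongo-0.mongo:27017/ycsb?replicaSet=rs0&w=1&readConcernLevel=majority&readPreference=primary"
./bin/ycsb run mongodb -s -P workloads/workloada -threads 16 \
  -p mongodb.url="mongodb://mongo-0.mongo:27017/ycsb?replicaSet=rs0&w=1&readConcernLevel=majority&readPreference=primary"
```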
Run the script to perform a scalability experiment:

```
./runner.sh -w workload1 -e experiment2 -i 1 -c 1 -x throughput -m 1 -n 100 -s 5 -r 5 -W 1 -R local -P primary
```

This runs workload1 as the experiment with id 2, performs 1 iteration on the cloud (-c 1), going from 1 to 100 client threads in increments of 5, and repeats each run of the workload 5 times. Each request uses writeConcern w = 1 and readConcern = local, reading from the primary.
If running on the cloud, copy the experiment folder from the pod to the local environment:

```
kubectl cp default/ycsb:/experiments/experiment1 ./results/experiment1 -c ycsb
```

Group 01
| Number | Name | GitHub | Email |
|---|---|---|---|
| 90704 | Andre Augusto | https://github.com/AndreAugusto11 | andre.augusto@tecnico.ulisboa.pt |
| 90744 | Lucas Vicente | https://github.com/WARSKELETON | lucasvicente@tecnico.ulisboa.pt |
| 90751 | Manuel Mascarenhas | https://github.com/Mascarenhas12 | manuel.d.mascarenhas@tecnico.ulisboa.pt |