Ansible changes removes local caches - intended?
Hi ALA folks
With this commit: https://github.com/gbif/ala-install/commit/d5b6394f44aa82451079d84f646e200ca...
Things like the following were removed:
-- name: determine if the application is in our local repo
-  local_action: stat path={{local_repo_dir}}/{{biocache_service}}.war
-  register: biocache_war_path
-
-- name: fetch application if it is not in our local repo
-  local_action: get_url url={{biocache_service_url}} dest={{local_repo_dir}}/{{biocache_service}}.war
-  when: biocache_war_path.stat.exists == false
The original design was to pull the large artifacts to a local directory once (the .ala folder) so that if you were provisioning different environments (local, UAT, production, etc.) you weren’t troubled with huge download times for each environment.
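A minimal sketch of how the removed check-then-fetch pattern might look in current Ansible syntax, with an upload step added at the end. The variables local_repo_dir, biocache_service and biocache_service_url come from the removed tasks; tomcat_webapps_dir is a hypothetical name for the deployment directory on the target host, and the upload task is an assumption about what followed.

```yaml
# Sketch only: check the cache on the control machine, fetch once, then upload.
- name: determine if the application is in our local repo
  ansible.builtin.stat:
    path: "{{ local_repo_dir }}/{{ biocache_service }}.war"
  delegate_to: localhost
  register: biocache_war_path

- name: fetch application if it is not in our local repo
  ansible.builtin.get_url:
    url: "{{ biocache_service_url }}"
    dest: "{{ local_repo_dir }}/{{ biocache_service }}.war"
  delegate_to: localhost
  when: not biocache_war_path.stat.exists

# tomcat_webapps_dir is a hypothetical variable, not taken from ala-install.
- name: copy the cached application to the target host
  ansible.builtin.copy:
    src: "{{ local_repo_dir }}/{{ biocache_service }}.war"
    dest: "{{ tomcat_webapps_dir }}/{{ biocache_service }}.war"
```

The download happens once per control machine, but every environment still pays for the upload from that machine to the target.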
Was this an oversight perhaps? In Europe we see tens of minutes spent at this stage, hence the original design to cache locally.
If we get into proper Maven releases and versioned artifacts deployed to Nexus, we can start mirroring Nexus repositories to bring back data locality too (plus it’ll use the local .m2 repo directory on the host).
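One way this could look, as a sketch rather than anything in ala-install: the maven_artifact module from the community.general collection resolves a versioned artifact from a repository URL. Here nexus_mirror_url, biocache_group_id and biocache_version are hypothetical variables, and treating biocache_service as the artifact id is an assumption. The module has its own Python requirements on the machine where it runs.

```yaml
# Sketch only: resolve a released WAR from a nearby Nexus mirror.
# nexus_mirror_url, biocache_group_id and biocache_version are hypothetical.
- name: fetch a versioned application WAR from the Nexus mirror
  community.general.maven_artifact:
    repository_url: "{{ nexus_mirror_url }}"
    group_id: "{{ biocache_group_id }}"
    artifact_id: "{{ biocache_service }}"
    version: "{{ biocache_version }}"
    extension: war
    dest: "{{ local_repo_dir }}/{{ biocache_service }}.war"
  delegate_to: localhost
```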
Any thoughts on this? Would you like me to revert those changes?
Thanks, Tim
Hi Tim,
I made that change during a big refactor - but for selfish reasons. Using the local repo meant pushing artefacts from my home internet connection to external VMs (EC2 and Nectar - an Aussie provider), which made running the scripts from scratch too slow to use or test. Testing with Vagrant is very useful, but only up to a point. I think Australia lacks the internet speeds (particularly upload speeds) of Scandinavia :) Perhaps we need a toggle to use the local repo or not. Alternatively, we need a way of replicating artefacts to other Maven repositories so that it isn't slow from other countries.
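A sketch of what such a toggle might look like, assuming a hypothetical boolean variable use_local_repo (default true) and a hypothetical tomcat_webapps_dir on the target host; neither name comes from ala-install.

```yaml
# Sketch only: use_local_repo and tomcat_webapps_dir are hypothetical names.
- name: fetch the application into the local repo on the control machine
  ansible.builtin.get_url:
    url: "{{ biocache_service_url }}"
    dest: "{{ local_repo_dir }}/{{ biocache_service }}.war"
    force: false          # keep an already cached file
  delegate_to: localhost
  run_once: true          # avoid parallel downloads to the same local file
  when: use_local_repo | default(true) | bool

- name: upload the cached application from the control machine
  ansible.builtin.copy:
    src: "{{ local_repo_dir }}/{{ biocache_service }}.war"
    dest: "{{ tomcat_webapps_dir }}/{{ biocache_service }}.war"
  when: use_local_repo | default(true) | bool

- name: download the application directly on the target host
  ansible.builtin.get_url:
    url: "{{ biocache_service_url }}"
    dest: "{{ tomcat_webapps_dir }}/{{ biocache_service }}.war"
  when: not (use_local_repo | default(true) | bool)
```

Someone with a slow uplink could then run the playbook with -e use_local_repo=false, while others keep the cached behaviour by default.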
Cheers
Dave
________________________________________
From: ala-portal-bounces@lists.gbif.org [ala-portal-bounces@lists.gbif.org] on behalf of Tim Robertson [GBIF] [trobertson@gbif.org]
Sent: 03 June 2014 21:33
To: ala-portal@lists.gbif.org
Subject: [Ala-portal] Ansible changes removes local caches - intended?
_______________________________________________ Ala-portal mailing list Ala-portal@lists.gbif.org http://lists.gbif.org/mailman/listinfo/ala-portal
Thanks Dave - a good point I hadn’t considered. Let’s leave it for now then, but we might want to consider somehow getting the artifacts cached and replicated (Nexus seems like a decent option for that).
Cheers, Tim
On 03 Jun 2014, at 13:50, David.Martin@csiro.au wrote:
----------------------------------------------------------------------------------------
Tim Robertson - GBIF Head of Informatics - trobertson@gbif.org
Global Biodiversity Information Facility http://www.gbif.org/
GBIF Secretariat, Universitetsparken 15, DK-2100 Copenhagen Ø, Denmark
Tel: +45 3532 1487 Mob: +45 2826 1487 Fax: +45 2875 1480
----------------------------------------------------------------------------------------
Hi Dave,
I can see src variables like biocache_hub_war_url pointing to a WAR file, which makes this step essentially a file copy over the internet. In that case, I wonder whether it’s possible to deposit the WAR file on a cloud storage service (GDrive, Dropbox, or any other) as part of your release workflow, and then modify the Ansible playbook to retrieve the WAR file directly from that service?
I find this copying step is the most time-consuming, and I believe putting the file in the cloud would speed it up a lot and make testing the ALA setup easier. But I may be overlooking something here, since I am not familiar with the Java development workflow.
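As a rough sketch of that idea: the URL variable could point at the storage service and the target host could download the file itself, so nothing passes through the control machine. Only biocache_hub_war_url appears in the playbooks mentioned above; the host group, the example URL, the checksum variable and tomcat_webapps_dir are all hypothetical.

```yaml
# Sketch only: host group, URL, checksum and tomcat_webapps_dir are placeholders.
- hosts: biocache_hub
  vars:
    biocache_hub_war_url: "https://storage.example.org/ala/biocache-hub.war"
    # Replace with the digest published alongside the release.
    biocache_hub_war_checksum: "sha256:<hex digest of the released WAR>"
  tasks:
    - name: download the WAR from the storage service onto the target host
      ansible.builtin.get_url:
        url: "{{ biocache_hub_war_url }}"
        dest: "{{ tomcat_webapps_dir }}/biocache-hub.war"
        checksum: "{{ biocache_hub_war_checksum }}"
```

Pinning a checksum is a design choice here: it lets get_url detect a corrupted or unexpectedly replaced file when the artifact lives outside the project's own repository.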
What do you think?
Cheers,
Burke
On 03 Jun 2014, at 14:00, Tim Robertson [GBIF] <trobertson@gbif.org> wrote:
Thanks Burke
It's a good idea. We're looking into CDNs and replicating artefacts to the GBIF Maven repository.
Dave
________________________________
From: Burke Chih-Jen Ko [GBIF] [bko@gbif.org]
Sent: 09 June 2014 23:25
To: Martin, Dave (CES, Black Mountain)
Cc: ala-portal@lists.gbif.org
Subject: Re: [Ala-portal] Ansible changes removes local caches - intended?
participants (3)
- Burke Chih-Jen Ko [GBIF]
- David.Martin@csiro.au
- Tim Robertson [GBIF]