CHECK STATION LOCATION
CHECK IF TIDAL HEIGHT IS IN FEET OR METERS -> maybe do it “by hand” instead of with a package: https://lukemiller.org/index.php/2013/05/more-tide-prediction-with-r/

Goal of markdown: Filter out hourly median environmental water data (salinity, pH, and water temperature) recorded during low tide, when Fucus is exposed. I will do the opposite for air temperature: keep only data recorded during low tide, when Fucus is exposed.

Approach:
- Read in hourly median environmental data and format datetime
- Create a df with hourly tidal height (datetime and tide height column)
- Align tidal data with environmental data
- Remove/replace with NA water data when tidal height is <1m (when Fucus is exposed)
- Remove/replace with NA air data when tidal height is >1m (when Fucus is submerged)
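The steps above can be sketched end to end on toy data. This is only a minimal illustration in base R: the data frames `env` and `tide_toy`, their values, and `threshold_m` are made up for the example, and the real environmental files and tide predictions are read in further down.

```r
# Toy hourly environmental data (hypothetical values)
env <- data.frame(
  datetime = as.POSIXct("2018-06-01 00:00:00", tz = "UTC") + 3600 * 0:5,
  salinity = c(30.1, 30.4, 29.8, 30.0, 30.2, 29.9)
)

# Toy tide heights in meters on the same hourly grid (hypothetical values)
tide_toy <- data.frame(
  datetime = env$datetime,
  TideHeight = c(1.6, 1.2, 0.8, 0.4, 0.9, 1.3)
)

threshold_m <- 1  # placeholder cutoff for when Fucus is exposed

# Align by datetime, blank out water data at low tide, then drop those rows
merged <- merge(env, tide_toy, by = "datetime", all.x = TRUE)
merged$salinity[merged$TideHeight < threshold_m] <- NA
submerged_only <- merged[!is.na(merged$salinity), ]
```

Air temperature would use the same pattern with the comparison flipped (`> threshold_m`).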

Tidal height: I’m using tidal height from North Point, Pier 41, San Francisco, San Francisco Bay, California. I’m using 1 m as the height below which Fucus is exposed (need to follow up with Ryan about the tidal height of Fucus). For now, I remove air temperature data when the tidal height is greater than 1 m.
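Since the units question is still open, here is a small sketch of how the cutoff could be applied once heights are known to be in meters. The helper names `feet_to_meters` and `is_exposed` are hypothetical, and the conversion is only needed if the tide source turns out to report feet.

```r
FT_TO_M <- 0.3048  # one international foot in meters (exact by definition)

# Convert tide heights from feet to meters (only if the source reports feet)
feet_to_meters <- function(height_ft) height_ft * FT_TO_M

# TRUE when the tide is below the cutoff, i.e. Fucus is assumed to be exposed
is_exposed <- function(tide_height_m, threshold_m = 1) {
  tide_height_m < threshold_m
}

heights_ft <- c(1.5, 3.0, 4.2, 6.0)   # hypothetical heights in feet
heights_m <- feet_to_meters(heights_ft)
exposed <- is_exposed(heights_m)
```

Keeping the cutoff as a function argument means it only has to change in one place after the follow-up about Fucus height.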

Links from searches:
https://tidesandcurrents.noaa.gov/map/index.shtml?lat=37.88157000000007&lng=-122.46661999999998&zoom=10
https://search.usa.gov/search?utf8=%E2%9C%93&affiliate=noaa.gov&query=tidal+height+data&commit=

Links from Karina
https://www.rdocumentation.org/packages/TideHarmonics/versions/0.1-1/topics/predict.tide
https://cran.r-project.org/web/packages/TideTables/TideTables.pdf
https://lukemiller.org/index.php/tag/tide-height/
https://rdrr.io/cran/oce/man/predict.tidem.html
https://lukemiller.org/index.php/2013/05/more-tide-prediction-with-r/

Set up

rm(list=ls())

library(tidyverse)
library(ggpubr)
library(scales)
library(chron)
library(plotly)
library(taRifx)
library(aweek)
library(easypackages)
## Error in get(genname, envir = envir) : object 'testthat_print' not found
library(renv)
library(here)
library(ggthemes)
library(gridExtra)
library(patchwork)
library(tidyquant)
library(recipes) 
library(cranlogs)
library(knitr)
library(openair)
library(data.table)

Tidal Height

References: https://lukemiller.org/index.php/2013/05/more-tide-prediction-with-r/
https://www.r-bloggers.com/2016/09/rtide-a-r-package-for-predicting-tide-heights-us-locations-only-currently/
Site names: http://www.flaterco.com/xtide/locations.html (Only the stations labeled ‘Ref’ will work)

Using package “rtide”

#package
library(rtide)
## Warning: package 'rtide' was built under R version 4.0.5
## rtide is not suitable for navigation
#checking station locations
tide_stations("San Francisco")
##  [1] "Alameda, San Francisco Bay, California"                            
##  [2] "Chevron Oil Company Pier, Richmond, San Francisco Bay, California" 
##  [3] "Coyote Creek, Alviso Slough, San Francisco Bay, California"        
##  [4] "Dumbarton Highway Bridge, San Francisco Bay, California"           
##  [5] "Hunters Point, San Francisco Bay, California"                      
##  [6] "North Point, Pier 41, San Francisco, San Francisco Bay, California"
##  [7] "Oyster Point Marina, San Francisco Bay, California"                
##  [8] "Redwood City, Wharf 5, San Francisco Bay, California"              
##  [9] "Redwood Creek Marker 8, San Francisco Bay, California"             
## [10] "Rincon Point, Pier 22½, San Francisco Bay, California"
## [11] "San Francisco, San Francisco Bay, California"                      
## [12] "San Leandro Marina, San Francisco Bay, California"                 
## [13] "San Mateo Bridge (west end), San Francisco Bay, California"
#the station I want to use is Point Chauncey: SFB1309_1, Point Chauncey, 1.3 mi east of (depth 40 ft), San Francisco Bay, California, Current Ref, 37.8908° N 122.4188° W. I don't see it listed here (possibly because it is labeled as a current station rather than a tide station), so I may look into other programs.

#download tide estimates at a time step that can be aligned with the hourly median air temperature data. 1 hr increments didn't always line up, so the call below uses a finer 30-minute step; this is more data and takes a while to load, but this way I can keep as much environmental data as possible
tide = tide_height(station="North Point, Pier 41, San Francisco, San Francisco Bay, California",from = as.Date('2017-01-01'), 
        to = as.Date('2019-12-31'), minutes = 30, tz ='PST8PDT')

#view data
ggplot(data = tide, aes(x = DateTime, y = TideHeight)) + 
        geom_line() 
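Before aligning anything, it may help to confirm that the tide table is on a regular grid and that the heights look like meters. A minimal sketch, assuming the table has `DateTime` and `TideHeight` columns (the names used in the plot above); `check_tide_grid` and `tide_toy` are hypothetical and stand in for the real `tide` object.

```r
# Summarise the time step and height range of a tide table
check_tide_grid <- function(tide_df, time_col = "DateTime", expected_minutes = 30) {
  steps <- diff(as.numeric(tide_df[[time_col]])) / 60  # step sizes in minutes
  list(
    regular = all(abs(steps - expected_minutes) < 1e-6),
    n_rows = nrow(tide_df),
    height_range = range(tide_df$TideHeight, na.rm = TRUE)
  )
}

# Toy 30-minute tide series (hypothetical values, one day)
tide_toy <- data.frame(
  DateTime = as.POSIXct("2017-01-01 00:00:00", tz = "UTC") + 1800 * 0:47,
  TideHeight = 1 + 0.8 * sin(2 * pi * (0:47) / 24.8)
)

grid_check <- check_tide_grid(tide_toy)
```

Running `check_tide_grid(tide)` on the real download would show whether the 30-minute step held across the whole period, and the height range is a quick sanity check on the feet versus meters question.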

Aligning with environmental data

Environmental data: read in hourly medians and format the datetime column

#Salinity
cc.sal<-read.csv(
      "https://raw.githubusercontent.com/Cmwegener/thesis/master/data/environmental/hourly_median/cc_sal.csv",
    header = TRUE
  )
eos.sal<-read.csv(
      "https://raw.githubusercontent.com/Cmwegener/thesis/master/data/environmental/hourly_median/eos_sal.csv",
    header = TRUE
  )
rb.sal<-read.csv(
      "https://raw.githubusercontent.com/Cmwegener/thesis/master/data/environmental/hourly_median/rb_sal.csv",
    header = TRUE
  )
fp.sal<-read.csv(
      "https://raw.githubusercontent.com/Cmwegener/thesis/master/data/environmental/hourly_median/fp_sal.csv",
    header = TRUE
  )

#pH
cc.ph<-read.csv(
      "https://raw.githubusercontent.com/Cmwegener/thesis/master/data/environmental/hourly_median/cc_ph.csv",
    header = TRUE
  )
eos.ph<-read.csv(
      "https://raw.githubusercontent.com/Cmwegener/thesis/master/data/environmental/hourly_median/eos_ph.csv",
    header = TRUE
  )
rb.ph<-read.csv(
      "https://raw.githubusercontent.com/Cmwegener/thesis/master/data/environmental/hourly_median/rb_ph.csv",
    header = TRUE
  )
#no pH data at Fort Point

#water temperature
cc.wtemp<-read.csv(
      "https://raw.githubusercontent.com/Cmwegener/thesis/master/data/environmental/hourly_median/cc_watertemp.csv",
    header = TRUE
  )
eos.wtemp<-read.csv(
      "https://raw.githubusercontent.com/Cmwegener/thesis/master/data/environmental/hourly_median/eos_watertemp.csv",
    header = TRUE
  )
rb.wtemp<-read.csv(
      "https://raw.githubusercontent.com/Cmwegener/thesis/master/data/environmental/hourly_median/rb_watertemp.csv",
    header = TRUE
  )
fp.wtemp<-read.csv(
      "https://raw.githubusercontent.com/Cmwegener/thesis/master/data/environmental/hourly_median/fp_watertemp.csv",
    header = TRUE
  )

#air temperature
eos.air<-read.csv(
      "https://raw.githubusercontent.com/Cmwegener/thesis/master/data/environmental/hourly_median/eos_hourly_med_air.csv",
    header = TRUE
  )

gg.air<-read.csv(
      "https://raw.githubusercontent.com/Cmwegener/thesis/master/data/environmental/hourly_median/gg_hourly_med_air.csv",
    header = TRUE
  )

#format datetime column
#salinity
cc.sal$datetime<-as.POSIXct(cc.sal$datetime, format = c("%Y-%m-%d %H:%M:%S"))
eos.sal$datetime<-as.POSIXct(eos.sal$datetime, format = c("%Y-%m-%d %H:%M:%S"))
rb.sal$datetime<-as.POSIXct(rb.sal$datetime, format = c("%Y-%m-%d %H:%M:%S"))
fp.sal$datetime<-as.POSIXct(fp.sal$datetime, format = c("%Y-%m-%d %H:%M:%S"))

#ph
cc.ph$datetime<-as.POSIXct(cc.ph$datetime, format = c("%Y-%m-%d %H:%M:%S"))
eos.ph$datetime<-as.POSIXct(eos.ph$datetime, format = c("%Y-%m-%d %H:%M:%S"))
rb.ph$datetime<-as.POSIXct(rb.ph$datetime, format = c("%Y-%m-%d %H:%M:%S"))

#water temp
cc.wtemp$datetime<-as.POSIXct(cc.wtemp$datetime, format = c("%Y-%m-%d %H:%M:%S"))
eos.wtemp$datetime<-as.POSIXct(eos.wtemp$datetime, format = c("%Y-%m-%d %H:%M:%S"))
rb.wtemp$datetime<-as.POSIXct(rb.wtemp$datetime, format = c("%Y-%m-%d %H:%M:%S"))
fp.wtemp$datetime<-as.POSIXct(fp.wtemp$datetime, format = c("%Y-%m-%d %H:%M:%S"))

#air temp
eos.air$datetime<-as.POSIXct(eos.air$datetime, format = c("%Y-%m-%d %H:%M:%S"))
gg.air$datetime<-as.POSIXct(gg.air$datetime, format = c("%Y-%m-%d %H:%M:%S"))
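The same parsing step can be written once and applied to a list of data frames. This is a sketch rather than a replacement for the lines above: `parse_datetime` and `env_list` are hypothetical, and the `tz` argument is an assumption, since the source files do not state their time zone. Setting it explicitly keeps the environmental timestamps in the same zone as the tide predictions instead of whatever the computer's local zone is.

```r
# Parse the datetime column of one data frame with an explicit time zone
parse_datetime <- function(df, tz = "PST8PDT") {
  df$datetime <- as.POSIXct(as.character(df$datetime),
                            format = "%Y-%m-%d %H:%M:%S", tz = tz)
  df
}

# Toy stand-ins for the data frames read in above (hypothetical values)
env_list <- list(
  sal = data.frame(
    datetime = c("2018-06-01 00:00:00", "2018-06-01 01:00:00"),
    salinity = c(30.1, 30.4),
    stringsAsFactors = FALSE
  ),
  ph = data.frame(
    datetime = "2018-06-01 00:00:00",
    ph = 8.01,
    stringsAsFactors = FALSE
  )
)

env_list <- lapply(env_list, parse_datetime)

# Count timestamps that failed to parse (these would silently become NA)
n_failed <- sapply(env_list, function(df) sum(is.na(df$datetime)))
```

Checking `n_failed` is worthwhile because `as.POSIXct()` returns NA without an error when a string does not match the format.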

Merge and align hourly median df and tide by datetime column

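#need same column name to merge; rename by name rather than by position so it still works if the column order changes
names(tide)[names(tide) == "DateTime"] <- "datetime"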

#rounding hourly medians to the nearest 5 minutes (round_date() is from lubridate) to remove "seconds" and to allow all the environmental data to be aligned with a tidal height (which was downloaded every 30 min). At most this changes an environmental timestamp by 2.5 minutes, which is less than the interval the instruments recorded at (6 or 15 min)
cc.sal$datetime<-round_date(cc.sal$datetime, unit="5 minute")
eos.sal$datetime<-round_date(eos.sal$datetime, unit="5 minute")
rb.sal$datetime<-round_date(rb.sal$datetime, unit="5 minute")
fp.sal$datetime<-round_date(fp.sal$datetime, unit="5 minute")

cc.ph$datetime<-round_date(cc.ph$datetime, unit="5 minute")
eos.ph$datetime<-round_date(eos.ph$datetime, unit="5 minute")
rb.ph$datetime<-round_date(rb.ph$datetime, unit="5 minute")

cc.wtemp$datetime<-round_date(cc.wtemp$datetime, unit="5 minute")
eos.wtemp$datetime<-round_date(eos.wtemp$datetime, unit="5 minute")
rb.wtemp$datetime<-round_date(rb.wtemp$datetime, unit="5 minute")
fp.wtemp$datetime<-round_date(fp.wtemp$datetime, unit="5 minute")

eos.air$datetime<-round_date(eos.air$datetime, unit="5 minute")
gg.air$datetime<-round_date(gg.air$datetime, unit="5 minute")
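To make the 2.5-minute claim concrete, here is a base R sketch of the same kind of rounding with the resulting shift measured. `round_to_minutes` is a hypothetical helper written for illustration, not the lubridate function; it can differ from `round_date()` for timestamps exactly halfway between two grid points, because base `round()` rounds halves to the even value.

```r
# Round POSIXct timestamps to the nearest multiple of `minutes`
round_to_minutes <- function(x, minutes = 5) {
  step <- minutes * 60
  secs <- round(as.numeric(x) / step) * step
  .POSIXct(secs, tz = attr(x, "tzone"))  # keep the original time zone
}

# Toy timestamps with stray seconds (hypothetical values)
x <- as.POSIXct(
  c("2018-06-01 00:02:10", "2018-06-01 00:57:40", "2018-06-01 01:00:00"),
  tz = "UTC"
)

rounded <- round_to_minutes(x, minutes = 5)

# Largest change introduced by rounding, in seconds (cannot exceed 150)
max_shift_sec <- max(abs(as.numeric(rounded) - as.numeric(x)))
```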

#Merge
#salinity
cc.sal.tide<-merge(cc.sal, tide, by="datetime", all=TRUE)
eos.sal.tide<-merge(eos.sal, tide, by="datetime", all=TRUE)
rb.sal.tide<-merge(rb.sal, tide, by="datetime", all=TRUE)
fp.sal.tide<-merge(fp.sal, tide, by="datetime", all=TRUE)

#ph
cc.ph.tide<-merge(cc.ph, tide, by="datetime", all=TRUE)
eos.ph.tide<-merge(eos.ph, tide, by="datetime", all=TRUE)
rb.ph.tide<-merge(rb.ph, tide, by="datetime", all=TRUE)

#water temp
cc.wtemp.tide<-merge(cc.wtemp, tide, by="datetime", all=TRUE)
eos.wtemp.tide<-merge(eos.wtemp, tide, by="datetime", all=TRUE)
rb.wtemp.tide<-merge(rb.wtemp, tide, by="datetime", all=TRUE)
fp.wtemp.tide<-merge(fp.wtemp, tide, by="datetime", all=TRUE)

#air temp
eos.air.tide<-merge(eos.air, tide, by="datetime", all=TRUE)
gg.air.tide<-merge(gg.air, tide, by="datetime", all=TRUE)

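#for some reason there are now 8 NA values for tidal height at the beginning of the data set. This isn't during my survey dates so it should be ok, but it is still odd (8 hourly rows would match the 8 hour difference between UTC and PST, so a time zone offset is one possible cause; not confirmed)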
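One way to see where unmatched rows come from is to compare a full merge with a merge that keeps only the environmental timestamps. A minimal sketch on toy data; `env`, `tide_toy`, and the result names are hypothetical.

```r
# Toy hourly environmental data (hypothetical values)
env <- data.frame(
  datetime = as.POSIXct("2018-06-01 00:00:00", tz = "UTC") + 3600 * 0:3,
  salinity = c(30.1, 30.4, 29.8, 30.0)
)

# Toy 30-minute tide data that starts two hours later than env
tide_toy <- data.frame(
  datetime = as.POSIXct("2018-06-01 02:00:00", tz = "UTC") + 1800 * 0:7,
  TideHeight = 0.6 + 0.1 * 0:7
)

# all = TRUE keeps every timestamp from both tables (what the merges above do)
merged_all <- merge(env, tide_toy, by = "datetime", all = TRUE)

# all.x = TRUE keeps only environmental timestamps, so no tide-only rows appear
merged_env <- merge(env, tide_toy, by = "datetime", all.x = TRUE)

# Environmental rows that found no tide height
unmatched <- merged_env[is.na(merged_env$TideHeight), ]
```

With `all.x = TRUE` the merged files stay the size of the environmental data, and inspecting `unmatched$datetime` shows exactly which timestamps have no tide height.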

Filter based on tide height (in meters). I’m still not sure at what tidal height Fucus is exposed, so I’m using 1 m as the cutoff for now. Instead of removing rows at this step, I replace the environmental value with NA: water data where TideHeight < 1 and air temperature where TideHeight > 1.

#remove when exposed, <1m
#salinity
cc.sal.tide$salinity[cc.sal.tide$TideHeight < 1] <- NA
eos.sal.tide$salinity[eos.sal.tide$TideHeight < 1] <- NA
rb.sal.tide$salinity[rb.sal.tide$TideHeight < 1] <- NA
fp.sal.tide$salinity[fp.sal.tide$TideHeight < 1] <- NA

#ph
cc.ph.tide$ph[cc.ph.tide$TideHeight < 1] <- NA
eos.ph.tide$ph[eos.ph.tide$TideHeight < 1] <- NA
rb.ph.tide$ph[rb.ph.tide$TideHeight < 1] <- NA

#water temp
cc.wtemp.tide$water_temp[cc.wtemp.tide$TideHeight < 1] <- NA
eos.wtemp.tide$water_temp[eos.wtemp.tide$TideHeight < 1] <- NA
rb.wtemp.tide$water_temp[rb.wtemp.tide$TideHeight < 1] <- NA
fp.wtemp.tide$water_temp[fp.wtemp.tide$TideHeight < 1] <- NA


#remove when submerged, >1m
#air temp
eos.air.tide$air_temperature[eos.air.tide$TideHeight > 1] <- NA
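#column name assumed to be air_temperature, matching the NA-removal step below; assigning to a partial name like air_temp would create a new column instead of filtering the existing one
gg.air.tide$air_temperature[gg.air.tide$TideHeight > 1] <- NA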
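The masking step can also be wrapped in a single function that takes the column name as a string and fails loudly if the column does not exist. This is a sketch under the same 1 m assumption; `mask_by_tide` and the toy `air` data frame are hypothetical.

```r
# Replace values with NA on the wrong side of the tide cutoff.
# keep = "submerged" blanks values when TideHeight < threshold (water data);
# keep = "exposed"   blanks values when TideHeight > threshold (air data).
mask_by_tide <- function(df, value_col, keep = c("submerged", "exposed"),
                         threshold_m = 1) {
  keep <- match.arg(keep)
  stopifnot(value_col %in% names(df), "TideHeight" %in% names(df))
  drop <- if (keep == "submerged") {
    df$TideHeight < threshold_m
  } else {
    df$TideHeight > threshold_m
  }
  # which() skips rows with unknown tide height, so those values are left as is
  df[[value_col]][which(drop)] <- NA
  df
}

# Toy air temperature data (hypothetical values)
air <- data.frame(
  air_temperature = c(12, 14, 16, 15),
  TideHeight = c(0.4, 0.9, 1.2, NA)
)

air_masked <- mask_by_tide(air, "air_temperature", keep = "exposed")
```

Using `[[value_col]]` with an exact name avoids the partial matching that `$` does on data frames, which can silently read from one column and write to a new one.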

These are now very large files because tidal height data was downloaded every 30 minutes for three years (2017 through 2019). I’m going to remove rows with “NA” in the environmental data column. Usually I prefer to keep rows as placeholders with “NA”, but since the analysis I’m doing depends on the actual date and not the number of rows, it’s okay in this case.

#removing rows with NA in environmental variable column
cc.sal.tide<-cc.sal.tide[!is.na(cc.sal.tide$salinity),]
eos.sal.tide<-eos.sal.tide[!is.na(eos.sal.tide$salinity),]
rb.sal.tide<-rb.sal.tide[!is.na(rb.sal.tide$salinity),]
fp.sal.tide<-fp.sal.tide[!is.na(fp.sal.tide$salinity),]

cc.ph.tide<-cc.ph.tide[!is.na(cc.ph.tide$ph),]
eos.ph.tide<-eos.ph.tide[!is.na(eos.ph.tide$ph),]
rb.ph.tide<-rb.ph.tide[!is.na(rb.ph.tide$ph),]

cc.wtemp.tide<-cc.wtemp.tide[!is.na(cc.wtemp.tide$water_temp),]
eos.wtemp.tide<-eos.wtemp.tide[!is.na(eos.wtemp.tide$water_temp),]
rb.wtemp.tide<-rb.wtemp.tide[!is.na(rb.wtemp.tide$water_temp),]
fp.wtemp.tide<-fp.wtemp.tide[!is.na(fp.wtemp.tide$water_temp),]

eos.air.tide<-eos.air.tide[!is.na(eos.air.tide$air_temperature),]
gg.air.tide<-gg.air.tide[!is.na(gg.air.tide$air_temperature),]
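A small guard can make this step safer. If a column name is misspelled, `df$wrong_name` returns NULL and the subset quietly returns zero rows instead of raising an error. The sketch below checks the name first; `drop_missing` and the toy data frame are hypothetical.

```r
# Drop rows where the environmental value is NA, checking the column exists
drop_missing <- function(df, value_col) {
  stopifnot(value_col %in% names(df))
  df[!is.na(df[[value_col]]), , drop = FALSE]
}

# Toy merged data (hypothetical values)
toy <- data.frame(
  air_temperature = c(12, NA, 15),
  TideHeight = c(0.4, 1.3, 0.8)
)

kept <- drop_missing(toy, "air_temperature")

# What happens without the guard when the name is wrong: zero rows, no error
silent_failure <- toy[!is.na(toy$airtemp), ]
```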

Now I have water data (salinity, pH, and water temperature) only when Fucus is submerged and air temperature only when Fucus is exposed (tide height less than 1 m; this cutoff may need to change). Next is to link it with field data. I’m going to save each of these data frames and link them with field data in another markdown.

Save as csv

write.csv(cc.sal.tide, "C:/Users/chels/Box Sync/Thesis/Data/Working data/Bouy data/cc.sal.tide.csv")
write.csv(eos.sal.tide, "C:/Users/chels/Box Sync/Thesis/Data/Working data/Bouy data/eos.sal.tide.csv")
write.csv(rb.sal.tide, "C:/Users/chels/Box Sync/Thesis/Data/Working data/Bouy data/rb.sal.tide.csv")
write.csv(fp.sal.tide, "C:/Users/chels/Box Sync/Thesis/Data/Working data/Bouy data/fp.sal.tide.csv")

write.csv(cc.ph.tide, "C:/Users/chels/Box Sync/Thesis/Data/Working data/Bouy data/cc.ph.tide.csv")
write.csv(eos.ph.tide, "C:/Users/chels/Box Sync/Thesis/Data/Working data/Bouy data/eos.ph.tide.csv")
write.csv(rb.ph.tide, "C:/Users/chels/Box Sync/Thesis/Data/Working data/Bouy data/rb.ph.tide.csv")

write.csv(cc.wtemp.tide, "C:/Users/chels/Box Sync/Thesis/Data/Working data/Bouy data/cc.wtemp.tide.csv")
write.csv(eos.wtemp.tide, "C:/Users/chels/Box Sync/Thesis/Data/Working data/Bouy data/eos.wtemp.tide.csv")
write.csv(rb.wtemp.tide, "C:/Users/chels/Box Sync/Thesis/Data/Working data/Bouy data/rb.wtemp.tide.csv")
write.csv(fp.wtemp.tide, "C:/Users/chels/Box Sync/Thesis/Data/Working data/Bouy data/fp.wtemp.tide.csv")

write.csv(eos.air.tide, "C:/Users/chels/Box Sync/Thesis/Data/Working data/Bouy data/eos.air.tide.csv")
write.csv(gg.air.tide, "C:/Users/chels/Box Sync/Thesis/Data/Working data/Bouy data/gg.air.tide.csv")
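The thirteen `write.csv()` calls could also be written as a loop over a named list. A minimal sketch, assuming a hypothetical output folder under `tempdir()` and toy data frames in place of the real ones. `row.names = FALSE` is a choice made here, not something the calls above do; it avoids an extra unnamed index column when the files are read back in.

```r
# Hypothetical output folder; swap in the real project path as needed
out_dir <- file.path(tempdir(), "tide_filtered")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# Toy stand-ins for the filtered data frames (hypothetical values)
outputs <- list(
  cc.sal.tide = data.frame(datetime = "2018-06-01 00:00:00",
                           salinity = 30.1, TideHeight = 1.6),
  eos.air.tide = data.frame(datetime = "2018-06-01 03:00:00",
                            air_temperature = 14.2, TideHeight = 0.4)
)

# Write each data frame to <out_dir>/<name>.csv
for (nm in names(outputs)) {
  write.csv(outputs[[nm]], file.path(out_dir, paste0(nm, ".csv")),
            row.names = FALSE)
}

written <- list.files(out_dir, pattern = "\\.csv$")
```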