The research investigates the influence of human expertise on the effectiveness of ice management operations. The key contribution is an experimental method for investigating human factor issues in an operational setting. Ice management is defined as a systematic operation that enables a marine operation to proceed safely in the presence of sea ice. In this study, the effectiveness of ice management operations was assessed in terms of the operator's ability to modify the presence of pack ice around an offshore structure. A full-mission marine simulator served as the venue for a systematic investigation in which volunteer participants spanning a range of seafaring experience levels individually completed ice management tasks. Using data recorded from 36 individuals' simulations, we compared ice management effectiveness metrics against two independent variables: (i) the experience level of the participant, categorized as either cadet or seafarer, and (ii) ice severity, measured as ice concentration. The results showed a significant difference in ice management effectiveness between experience categories. We examined what the seafarers did that made them more effective and characterized their operational tactics. The research provides insight into the relative importance of vessel operator skills in contributing to effective ice management, as well as how this relative importance changes as ice conditions vary from mild to severe. These findings may have implications for training in the nautical sciences and could help to inform good practices in ice management.
Introduction
The human factor of expertise is critically important to marine operations such as ice management, which rely heavily on the knowledge, competence, and proficiency of personnel. Despite its importance, expertise is rarely included in engineering assessments, mainly because it is difficult to study systematically. In this research, a systematic investigation of the human factor of experience in ice management is made possible with the use of a full-mission marine simulator.
The influence of two factors on the effectiveness of ice management operations was studied: (i) bridge officer experience and (ii) ice severity. It was hypothesized that ice management would be more effective (that is, more ice would be cleared in a given amount of time) with more experienced bridge officers compared to novice ones. Furthermore, it was hypothesized that this experience effect would be stronger in more severe ice conditions (higher ice concentrations).
The context of the research question is important for marine industry operators in areas where sea ice and glacial ice must be managed to enable operations to proceed safely [1,2]. This research investigates the case of drifting broken sea ice entering an area in which a moored, floating installation is present. Such operations have been documented for many regions, including the Arctic Ocean [3], the Okhotsk Sea [4], and the Beaufort Sea [5], to cite a few cases. An experimental campaign was designed to test the effects of the human factor of bridge crew experience and the environmental factor of ice concentration on the effectiveness of a defined ice management operation.
Methods
Simulator.
The ice management simulator used in the experiment was designed and built for research. It uses PhysX software [6]. The simulator consists of a bridge console positioned at the center of a 360 deg panoramic projection screen. The bridge console is a 2 m × 2 m platform mounted on a Moog motion bed; for this experiment, the motion bed was turned off. A schematic of the simulator is shown in Fig. 1.
The simulated vessel used in the ice management scenarios was based on anchor handling tug supply (AHTS) vessels typical of those used in the Newfoundland offshore area. It has a length overall of 75 m and is powered by twin 5369 kW diesel engines. For propulsion, it has two controllable pitch (CP) propellers and rudders, plus forward and aft tunnel thrusters, each rated at 895 kW. The simulator bridge consisted of a simplified forward console and aft console. To switch between consoles, the driver had to turn to the opposing console and transfer controls using “Transfer” toggle switches. Both consoles had basic controls: main propellers (port and starboard), steering, and tunnel thrusters (fore and aft). A schematic of the forward console is shown in Fig. 2.
The bridge console was highly simplified and did not have navigational components such as radar, global positioning system, or chart systems. Moreover, the simulated version of the AHTS was not an exact hydrodynamic likeness, particularly with regard to its seakeeping and maneuvering characteristics. Notwithstanding these limitations in similitude, the simulator was adequate for this experiment, whose purpose was to detect differences between bridge officer experience groups and characterize general principles of good ice management practices.
All participants in the experiment were given 60 min to familiarize themselves with the controls and maneuvering characteristics prior to completing the ice management scenarios. This was accomplished in three basic 20 min-long scenarios designed to habituate participants to the simulation environment. None of the participants had used the simulator before. Signs and symptoms of simulator-induced sickness were monitored before and after each exposure period to the simulator using a self-reported questionnaire [7]. No participants noted simulator sickness symptoms severe enough to justify stopping a simulation trial.
The Instructor Station was located several meters outside the periphery of the projection screen, out of view from inside the simulator. This is where the experimenters started and stopped scenarios and provided scripted instructions to the bridge officer inside the simulator. Instructions were communicated with a two-way very high frequency radio. Distances from the “own-ship” (the vessel being operated in the simulator) to specified targets could also be communicated in this manner, whenever they were requested. A screenshot taken from the Instructor Station monitor is shown in Fig. 3.

Screenshot from the Instructor Station monitor during simulation. Graphics shown here are identical to those that appear in Replay files, which were used for data analysis.
Data acquisition was handled by five dedicated processing computers. Zonal concentrations, time, latitude and longitude position, speed, and heading were recorded. A video “Replay” file was saved upon completion, which upon playback showed the 30 min simulation from start to finish. The Replay file imagery appeared as shown in Fig. 3.
Design of Experiments.
The approach adopted a formal design of experiments [8]: a 2^k full factorial with k = 2 factors and nine replicates was completed, totaling 36 runs. The first factor, ice severity, was represented by ice concentration and could be changed in the parameter settings of the simulator. The low-level treatment was set to four-tenths ice concentration; the high-level treatment was set to seven-tenths ice concentration. The second factor, experience level of bridge officers, was represented by a high- and low-level categorical variable. The low-level experience category consisted of 18 cadets enrolled in a local seafaring program (years spent at sea = 0–3). This group included six students from each of the first, second, and fourth year classes. The high-level experience category consisted of 18 seafarers (masters and mates) employed in the marine industry (average years spent at sea = 20 ± 10 years). This group included operators of coastal ferries, bulk carriers, cargo tankers, offshore supply vessels, and AHTSs, with the latter two subgroups contributing the highest number of participants. Participants were recruited on a volunteer basis. Following the research protocol approved by Memorial University's interdisciplinary committee on ethics in human research, all volunteers provided their informed consent before participating in the experiment.
Due to the logistical challenges of scheduling 36 voluntary research participants, it was impractical to run a completely random design (CRD). In standard experiment designs, the order of individual trials is determined randomly, which generally ensures that observations are independent and thereby satisfies the assumptions of the statistical methods used. A way to work within this restriction on randomization is the split-plot design [9,10]. In this design, the hard-to-change experience variable stayed constant within pairs of consecutive runs (the whole-plot), while the easy-to-change concentration variable was randomized within each group (the subplot). The experimental error of the whole-plot and subplot effect estimates could then be estimated separately, accounting for the biasing effect of uncontrollable experimental conditions that might otherwise have gone undetected.
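To make the run-order structure concrete, the following Python sketch generates an illustrative split-plot schedule in which experience is held constant within whole-plots of two consecutive runs and concentration is randomized within each whole-plot. It is a minimal illustration of the design structure under stated assumptions, not the scheduling procedure actually used; the names and random seed are assumptions.

```python
import random

# Illustrative split-plot run order (not the authors' actual scheduling code):
# the hard-to-change factor (experience) is held constant within whole-plots of
# two consecutive runs, and the easy-to-change factor (ice concentration, in
# tenths) is randomized within each whole-plot.

EXPERIENCE_LEVELS = ["cadet", "seafarer"]   # whole-plot (hard-to-change) factor
CONCENTRATIONS = [4, 7]                     # subplot (easy-to-change) factor
REPLICATES = 9                              # 2 levels x 2 levels x 9 = 36 runs

def split_plot_run_order(seed=1):
    rng = random.Random(seed)
    # 9 replicates x 2 experience levels = 18 whole-plots of two runs each.
    whole_plots = EXPERIENCE_LEVELS * REPLICATES
    rng.shuffle(whole_plots)                # randomize the order of whole-plots
    runs = []
    for wp_id, experience in enumerate(whole_plots, start=1):
        subplot = CONCENTRATIONS[:]
        rng.shuffle(subplot)                # randomize subplot order within the whole-plot
        for concentration in subplot:
            runs.append({"whole_plot": wp_id,
                         "experience": experience,
                         "concentration": concentration})
    return runs

if __name__ == "__main__":
    for run_number, run in enumerate(split_plot_run_order(), start=1):
        print(run_number, run)
```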
Several studies have reported how ice loads on structures might be reduced through ice management measures [11–15]. Because the experiment had just two independent variables (ice concentration and experience level) and one dependent variable (concentration reduction), the aim was to use straightforward scenarios in order to avoid introducing confounding factors. Still, each scenario had to be a realistic representation of an ice management exercise. Each of the 36 participants completed two different ice management scenarios reflecting different ice management tasks. The first scenario was called “precautionary” ice management, in which the participant was tasked with keeping the area around a moored floating production, storage, and offloading (FPSO) vessel clear of ice. The second scenario was called “emergency” ice management, in which the participant was tasked with clearing away ice from an area underneath one of the FPSO's lifeboat launch zones in preparation for evacuation. The areas the participants were tasked with clearing are outlined in Figs. 4 and 5. Other than FPSO heading (0 deg for the precautionary and 23 deg for the emergency scenario) and ice concentration (low level 4-tenths and high level 7-tenths), all other conditions in the scenarios were equivalent across trials. The FPSO maintained position and did not surge, yaw, or sway during simulation scenarios. As there were two ice concentration levels and two ice management tasks, four distinct scenario configurations were used in this experiment. (The two 7-tenths concentration cases are shown in Figs. 4 and 5.) Each participant completed both scenarios at the same ice concentration level. All scenarios were 30 min long. The floe sizes were randomly sampled from a lognormal distribution, and the floe thickness was uniformly set to 40 cm.
Analysis.
Two performance metrics were used to assess ice management effectiveness: (i) average ice clearing (tenths concentration) and (ii) cumulative ice-free lifeboat launch time (minutes). The latter metric applied to the emergency scenario only.
The analysis of the two metrics was conducted in a systematic way and was grounded in basic statistical principles. First, data were explored in a descriptive and visual sense. This was followed by a more rigorous approach whose aim was to determine the extent to which ice concentration and bridge officer experience influenced the effectiveness of the simulated ice management scenarios. The main effects of each factor and their interaction effect were determined in the same way as in a regular factorial design. The key difference is that a half-normal plot of effects was used to screen for significant effects for whole-plot and subplot groups separately [16]. Then analysis of variance (ANOVA) was used to check for significant effects and account for the group terms separately [17]. Significant effects from both groups were then combined to give the final model.
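As a structural illustration of this analysis, the following Python sketch fits a REML linear mixed model with experience, concentration, and their interaction as fixed effects and the whole-plot grouping as a random effect. The data file and column names are assumptions for illustration; the study's analysis was produced with dedicated design-of-experiments software, and the Kenward–Roger degrees of freedom reported in the tables below are not computed by this sketch.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hedged sketch of a split-plot REML analysis: fixed effects for experience,
# concentration, and their interaction; a random effect for the whole-plot
# group. "clearing.csv" and its column names are assumptions, not the study's
# actual data files.

df = pd.read_csv("clearing.csv")  # assumed columns: whole_plot, experience, concentration, avg_clearing

model = smf.mixedlm(
    "avg_clearing ~ C(experience) * C(concentration)",  # fixed effects and interaction
    data=df,
    groups=df["whole_plot"],                             # whole-plot (group) random effect
)
fit = model.fit(reml=True)                               # restricted maximum likelihood
print(fit.summary())                                     # fixed effects and variance components
```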
Residual plots were used to check modeling assumptions [18]. Diagnostic checks of modeling adequacy showed that the assumptions of normally distributed residuals, homoscedasticity, and independence of residuals with respect to run order were valid. This check was performed for all analyses in this work to confirm that the modeling assumptions were valid, thereby supporting the inferences based on them.
Additional analysis helped to answer the question of what characterized good ice management practices for particularly strong performers. For a chosen scenario and a chosen performance metric, we looked at three data sources: (i) plots of position during simulation, (ii) screenshots captured from Replay files, and (iii) exit interviews. This combination of qualitative and quantitative data enabled us to present a general description of the most effective ice management strategy for the emergency scenario.
Results
An example is given of a single test to illustrate what was measured. The example is followed by a more detailed discussion about general results.
Results from a single test are shown in a plot in Fig. 6. It shows the concentration measurement taken at a rate of once every 30 s during a 30 min emergency ice management scenario. The concentration measurements were recorded from the box area under the port lifeboat launch zone shown in Fig. 5. This is the area that participants in the experiment were tasked with clearing. Also shown in the plot in Fig. 6 is the baseline unmanaged ice concentration within this zone; that is, the ice concentration that occurs within the box area when no ice management is performed.

Example measurements from a single 30-minute simulation trial. “unmanaged ice” refers to recorded ice concentration values within the ice management zone when no ice management is performed; “managed ice” refers to that when ice management is performed. Zonal ice concentration values are recorded at 30 s intervals.

This example case was selected randomly from the 72 tests that were done. The driver in this case was from the seafarer group (high-level experience), with 16 years of experience at sea. The participant had performed ice management operations within the past three years and had experienced between 3 and 10 seasons in ice over their career.
From Fig. 6, it is clear that the baseline unmanaged ice concentration within the measurement zone is not steady. In fact, for the emergency ice management scenario, ice accumulated along the side of the FPSO as it drifted past, such that the ice concentration started at 7-tenths and rose to almost 9-tenths. It is therefore more helpful to analyze the drop in ice concentration relative to this baseline, rather than the recorded concentration in the box area at a given instant. From the computed concentration drop relative to the baseline, average clearing (in terms of ice concentration) can be derived. The reduction in ice concentration and the corresponding average clearing metric are shown in Fig. 7 for the example case.
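For clarity, the sketch below shows how the average clearing metric can be computed from two 30 s time series of zonal concentration, one unmanaged (baseline) and one managed; the example series are hypothetical placeholders, not measured data.

```python
import numpy as np

# Average clearing: mean drop in zonal ice concentration (tenths) relative to
# the unmanaged baseline, sampled every 30 s over the 30 min scenario.

def average_clearing(unmanaged, managed):
    """Mean reduction in zonal concentration relative to the unmanaged baseline."""
    unmanaged = np.asarray(unmanaged, dtype=float)
    managed = np.asarray(managed, dtype=float)
    return np.mean(unmanaged - managed)

# Hypothetical example: the baseline rises from 7 to 9 tenths while the managed
# zone is held near 6 tenths, giving an average clearing of about 2 tenths.
t = np.arange(0, 30.5, 0.5)            # minutes, sampled at 30 s intervals
baseline = 7 + 2 * t / 30              # unmanaged concentration (tenths)
managed = np.full_like(t, 6.0)         # managed concentration (tenths)
print(round(average_clearing(baseline, managed), 2))
```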

The example case showing the average clearing performance metric. This metric is derived by subtracting the managed ice zonal concentration values from the baseline unmanaged ice values recorded during the 30-minute simulation.
Average Clearing.
Having examined a single case in Figs. 6 and 7, we now look at the entire sample of 36 participants tasked with two 30-min scenarios each (72 total simulator trials and 36 h of total simulator time) to characterize the data in terms of the chosen performance metrics. Table 1 shows descriptive statistics for the average clearing metric for the cadet (low-level experience) group for precautionary and emergency ice management scenarios. Similarly, Table 2 shows descriptive statistics for the average clearing metric for the seafarer (high-level experience) group for precautionary and emergency ice management scenarios.
Descriptive statistics for cadets' average clearing (tenths concentration)
| Scenario | Concentration | Mean | Standard deviation | Minimum | Median | Maximum |
|---|---|---|---|---|---|---|
| Precautionary | 4 | 0.4 | 0.3 | 0.0 | 0.4 | 1.0 |
| | 7 | 1.0 | 0.6 | 0.4 | 0.9 | 2.1 |
| Emergency | 4 | 1.0 | 0.6 | 0.1 | 1.0 | 2.1 |
| | 7 | 1.7 | 0.8 | 0.3 | 1.8 | 2.9 |
Descriptive statistics for seafarers' average clearing (tenths concentration)
| Scenario | Concentration | Mean | Standard deviation | Minimum | Median | Maximum |
|---|---|---|---|---|---|---|
| Precautionary | 4 | 0.3 | 0.4 | −0.1 | 0.2 | 1.0 |
| | 7 | 1.2 | 0.7 | 0.2 | 1.0 | 2.5 |
| Emergency | 4 | 1.6 | 0.5 | 0.8 | 1.7 | 2.1 |
| | 7 | 2.6 | 0.7 | 0.9 | 2.9 | 3.4 |
From Tables 1 and 2, some characterizations can be made about the relative performance of the two groups across like scenarios and concentration levels. For instance, there is little difference between groups in the precautionary ice management scenario. The emergency ice management scenario, on the other hand, appears better suited to detecting differences between groups on the average clearing metric. Another important trend to note is that clearing is consistently higher at the higher concentration level.
These trends are visually represented in the boxplots in Fig. 8, which are grouped by concentration and experience for all trials of the emergency ice management scenario.
From Fig. 8, some additional characterizations can be made about the data. For instance, the spread in the data is clearly smaller in the seafarer group, an indication that seafarers are more consistent in their performance. In the cadet group, the data appear to be distributed evenly around their central tendencies, with no obvious positive or negative skew, whereas in the seafarer group there is a slight positive skew. A single outlier appears in the seafarer group (defined as a value more than 1.5 times the interquartile range beyond the first or third quartile).
Half-normal plots of subplot effects and whole-plot effects are shown in Figs. 9 and 10, respectively. Half-normal plots can be used to select significant factors for the analysis. Here, concentration (the easy-to-change subplot factor) appears significant among the subplot effects and experience (the hard-to-change whole-plot factor) appears significant among the whole-plot effects. The interaction effect is plotted on the whole-plot half-normal plot and does not appear significant, so it is dropped from the analysis. Restricted maximum likelihood (REML) ANOVA (Table 3) is computed to formally test for significant effects. It confirms that the experience and concentration factors are significant, with p-values less than the acceptable type I error rate of α = 5%.
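For readers unfamiliar with this screening step, the short sketch below constructs a half-normal plot of effect magnitudes of the kind shown in Figs. 9 and 10. The effect values and labels are placeholders for illustration only, not the study's estimates.

```python
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

# Half-normal plot of |effect| estimates from a two-level factorial: large,
# real effects plot far from the cluster of small, noise-like effects near the
# origin. The values below are hypothetical placeholders.

labels = ["B-Concentration", "AB", "error est. 1", "error est. 2"]
effects = np.array([1.1, 0.10, 0.05, 0.08])

order = np.argsort(np.abs(effects))          # indices sorted by |effect|
abs_eff = np.abs(effects)[order]
n = len(abs_eff)
probs = (np.arange(1, n + 1) - 0.5) / n      # plotting positions
quantiles = stats.halfnorm.ppf(probs)        # half-normal quantiles

plt.scatter(quantiles, abs_eff)
for q, e, idx in zip(quantiles, abs_eff, order):
    plt.annotate(labels[idx], (q, e))
plt.xlabel("Half-normal quantile")
plt.ylabel("|Effect|")
plt.title("Half-normal plot of effects (illustrative)")
plt.show()
```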

Half-normal plot of subplot effect. Squares indicate positive effects; triangles indicate error estimates. This plot shows that concentration has a large positive effect relative to the “error line” representing the smallest 50% of effects.

Half-normal plot of whole-plot effect. Squares indicate positive effects; triangles indicate error estimates. This plot shows that experience has a large positive effect relative to the error line representing the smallest 50% of effects.
REML ANOVA table (average clearing)
REML analysis for selected model (Kenward–Roger p-values), fixed effects (Type III)

| Source | Term df | Error df | F | p-value (Prob > F) |
|---|---|---|---|---|
| Whole-plot | 1 | 32.00 | 10.64 | 0.0026 |
| A-Experience | 1 | 32.00 | 10.64 | 0.0026 |
| Subplot | 2 | 32.00 | 7.34 | 0.0024 |
| B-Concentration | 1 | 32.00 | 14.25 | 0.0007 |
| AB | 1 | 32.00 | 0.43 | 0.5165 |
Note: This table shows that experience and ice concentration variables both have a significant effect on the average clearing response.
The finding that concentration is significant is not surprising because the factor is at least partly collinear with the response, average ice clearing. What is of greater interest is the interaction effect at different treatment combinations. As the experience–concentration interaction effect is not statistically significant, the hypothesis that the effect of experience on ice management effectiveness grows stronger with increasing ice concentration is rejected.
Note that analyzing the results as if they came from a CRD using standard ANOVA yields the same conclusions. This is explained by the REML variance component estimates (Table 4), which show a group variance of zero, indicating that the whole-plot model explains all of the variation between groups. The analysis is, in other words, equivalent to that of a randomized design. Even so, experimenters running similarly designed experiments should take the same precaution and should not analyze results as if they came from a CRD. This finding applies to all metrics analyzed in this work.
REML variance components table. As the group variance is zero, the analysis is equivalent to that of a CRD.
Variance components

| Source | Variance | StdErr | 95% CI Low | 95% CI High |
|---|---|---|---|---|
| Group | 0.000 | 0.000 | 0.000 | 0.000 |
| Residual | 0.44 | 0.11 | 0.29 | 0.76 |
| Total | 0.44 | | | |
Key results of the model are shown in the effects plot in Fig. 11. The plot shows the average clearing metric for all 36 participant trials of the emergency ice management scenario. Data are summarized with Fisher's least significant difference I-bars around the predictions at each treatment combination. This is an approximate way to check whether the predicted means at the displayed factor combinations are significantly different. As the two lines are nearly parallel, it is clear there is no interaction effect between the two factors on the average drop in managed ice concentration.

Interaction plot of concentration and experience on average ice clearing. As the two lines are approximately parallel, it shows that the effect of experience (the “gap” between the lines) does not increase at a higher concentration level.
Residual plots are presented here to show that the underlying assumptions required for the data analysis have been checked. The normal probability plot (Fig. 12) checks the assumption required in ANOVA that residuals are normally distributed. From Fig. 12, it is clear that the residuals follow approximately a straight line with no definite patterns; as such, the assumption that residuals are normally distributed holds. The residuals versus predicted plot (Fig. 13) is another diagnostic tool for visually checking modeling assumptions. It plots residuals against predicted response values. When the plot shows random scatter, it indicates that the variance is not related to the size of the response. This constant variance is called homoscedasticity and is another assumption required by ANOVA. It follows from Fig. 13 that the assumption that residuals are homoscedastic is acceptable.

Normal plot of residuals. Since the residuals of the measured responses follow approximately a straight line, the underlying assumption required by ANOVA that residuals be normally distributed is verified.

Residuals versus ascending predicted response values. The plot shows random scatter, thereby verifying an important underlying assumption in ANOVA that variance be constant.
Figure 14 shows a plot of residuals versus run order. The random scatter indicates that no unaccounted-for time-related variables are influencing the results.

Residuals versus run order. There appears to be no relationship between run order and residuals of measured responses, a critical check that verifies that no time-related lurking variables have affected results.
These important diagnostic checks were performed for all analyses of performance metrics in this work. Note that in the interest of brevity, diagnostic plots are shown only for the average clearing metric. The experimental results and analysis are presented in full in Ref. [20].
Ice-Free Lifeboat Launch Time.
During the emergency scenario, each participant was told “to clear ice from underneath the port lifeboat launch zone.” The lifeboat was visible near the port quarter of the FPSO, and participants had remarked in exit interviews that this visual aid had helped guide them to the location in which clearing was required. Although it was not the original intention, it followed that the cumulative ice-free lifeboat launch time, measured in minutes, would be a good metric of performance for this scenario.
To set up an appropriate analysis, the size and location of the lifeboat launch zone had to be specified. The lifeboat drop zone radius was set at 8 m, based on the size required to accommodate an 80-person capacity totally enclosed motor propelled survival craft (TEMPSC) typical of those found on FPSOs. These lifeboats have dimensions of approximately 10 m in length and 3.7 m in breadth. Experiments by Simões Ré et al. found that a target drop point radius of approximately 1.5 m accommodated launches of a TEMPSC from an offshore installation [19]. The 8 m “splash-zone” radius set to circumscribe this target area conservatively encompasses offsets that might occur due to missed target points and setbacks from first wave encounters. The origin of the zone was set 8 m off the side of the port quarter of the FPSO, so that the zone was tangent to the side of the FPSO hull. A schematic showing the lifeboat launch zone is presented in Fig. 15.

Port lifeboat launch zone (not to scale). The circle represents an 8 m radius splash zone based on the target drop point of a 10 m TEMPSC lifeboat launch.
The cumulative time that the lifeboat splash zone was ice-free was computed using MATLAB image processing software. Successive Replay files, captured at 30 s intervals during simulation, were cropped to the shape and size of the lifeboat splash zone. From there, a pixel count of the raster image was computed; if the image had more colored pixels than a cropped blank image, it meant that ice (or, in rare instances, the own-ship itself) was in the lifeboat zone. For each successive ice-free image, a 30 s time increment was added to the total time the zone was ice-free. The resulting total cumulative time was therefore an approximation. Given that at a current drift rate of 0.5 knots ice would drift no more than 8 m in 30 s, coinciding with the radius of the lifeboat splash zone, this approximate cumulative ice-free time estimate was considered appropriate for this study.
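The original processing was done in MATLAB; the following Python sketch illustrates the same pixel-counting idea under stated assumptions: one cropped screenshot of the splash zone per 30 s Replay frame, a reference crop of the zone in open water, and an arbitrary pixel-difference threshold. The file names and threshold are assumptions, not the study's values.

```python
import numpy as np
from PIL import Image

# Sketch of the cumulative ice-free time calculation (the study used MATLAB).
# Each Replay frame is assumed to be pre-cropped to the lifeboat splash zone.

INTERVAL_MIN = 0.5   # 30 s between successive Replay frames
THRESHOLD = 50       # differing pixels tolerated before the zone counts as occupied (assumed)

def zone_occupied(frame_path, blank_path, threshold=THRESHOLD):
    """True if the cropped zone differs from the open-water reference by more than the threshold."""
    frame = np.asarray(Image.open(frame_path).convert("L"))
    blank = np.asarray(Image.open(blank_path).convert("L"))
    differing = np.count_nonzero(frame != blank)   # pixels that are not open water
    return differing > threshold

def cumulative_ice_free_time(frame_paths, blank_path):
    """Cumulative minutes over the scenario for which the splash zone is ice-free."""
    return sum(INTERVAL_MIN for path in frame_paths if not zone_occupied(path, blank_path))

# Hypothetical usage with assumed file names (60 frames = one 30 min scenario):
# frames = [f"replay_zone_{i:03d}.png" for i in range(60)]
# print(cumulative_ice_free_time(frames, "zone_blank.png"))
```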
We can look at the sample of 36 participants tasked with the emergency scenario to characterize the data in terms of the ice-free lifeboat launch zone performance metric. This metric is measured in minutes of cumulative time that no ice is present in the 8 m radius lifeboat launch zone. Table 5 shows descriptive statistics for the cumulative ice-free time metric for the cadet (low-experience) group for the emergency ice management scenario. Table 6 shows the same statistics for the seafarer (high-experience) group.
Descriptive statistics for cadets' cumulative ice-free lifeboat launch times (minutes)
| Scenario | Concentration | Mean | Standard deviation | Minimum | Median | Maximum |
|---|---|---|---|---|---|---|
| Emergency | 4 | 7.3 | 4.4 | 0.0 | 6.5 | 16.5 |
| | 7 | 4.5 | 4.1 | 0.0 | 4.0 | 11.5 |
Descriptive statistics for seafarers' cumulative ice-free lifeboat launch times (minutes)
| Scenario | Concentration | Mean | Standard deviation | Minimum | Median | Maximum |
|---|---|---|---|---|---|---|
| Emergency | 4 | 10.0 | 3.2 | 7.0 | 8.5 | 16.5 |
| | 7 | 11.6 | 7.0 | 0.0 | 13.0 | 19.0 |
From Tables 5 and 6, there is a clear difference between experience groups. The mean cumulative time the lifeboat zone was ice-free is consistently higher for the seafarers. This trend is particularly striking at the high concentration treatment (7-tenths), in which seafarers kept the lifeboat zone ice-free more than twice as long, on average, as cadets.
The boxplots in Fig. 16 help to visualize the data. They are grouped by concentration and experience for all trials of the emergency ice management scenario.
From Fig. 16, some additional characterizations can be made about the data. The overall difference between groups in terms of cumulative ice-free lifeboat launch times is substantial. For example, the median values for the seafarer group at both concentration treatments are higher than even the respective third quartiles recorded for the cadet group. The spread is lower for the seafarers in the low-concentration treatment (4-tenths) but higher in the high-concentration treatment (7-tenths). Despite the better performance overall, the high spread may indicate a relatively higher degree of expertise variability within the seafarer group. It is also remarkable that responses are higher for the 7-tenths treatment in the seafarer group: the response in this case is independent of starting ice concentration, and the high-level treatment (7-tenths) was expected to represent a significant challenge compared to the low-level treatment (4-tenths). This surprising result cannot be explained by experience differences within the seafarer group, either, as experience levels for both treatments were similar (4-tenths treatment: seafarers' average years at sea = 20 ± 9; 7-tenths treatment: seafarers' average years at sea = 20 ± 10).
Analysis of variance shows that experience has a significant effect on the cumulative ice-free lifeboat launch time, with a p-value lower than the acceptable type I error rate of α = 5% (Table 7).
REML ANOVA (cumulative ice-free lifeboat launch time)
REML analysis for selected model (Kenward–Roger p-values), fixed effects (Type III)

| Source | Term df | Error df | F | p-value (Prob > F) |
|---|---|---|---|---|
| Whole-plot | 1 | 32.00 | 6.84 | 0.0135 |
| A-Experience | 1 | 32.00 | 6.84 | 0.0135 |
| Subplot | 2 | 32.00 | 1.03 | 0.3683 |
| B-Concentration | 1 | 32.00 | 1.10 | 0.3025 |
| AB | 1 | 32.00 | 0.96 | 0.3338 |
Note: This table shows that only the experience variable has a significant effect on the cumulative ice-free lifeboat launch time response.
Key results of the model are shown in the effects plot in Fig. 17. The plot shows the cumulative ice-free time metric for all 36 participant trials of the emergency ice management scenario. Data are summarized with Fisher's least significant difference I-bars around predictions at each treatment combination.
A square root power transformation was applied to stabilize variance. With this transformation, diagnostic checks of modeling adequacy show that the assumptions of normally distributed residuals, homoscedasticity, and independence of residuals with respect to run order are all valid.
Ice Management Tactics.
So far, considerable effort has gone into showing that a statistically significant difference exists between experienced seafarers and inexperienced cadets when it comes to performance in ice management. To measure this performance, we used two metrics of overall effectiveness: average clearing and cumulative ice-free lifeboat launch time. Differences between groups were detected only in the emergency scenario; the precautionary scenario (Fig. 4) failed to detect any differences between experience groups. Suggestions as to why this may be the case are presented in the Discussion. For the emergency scenario, both metrics showed that experience had a significant influence on ice management effectiveness. These findings raise an important question: what is the seafarer group doing that makes them more effective than the cadets? We examine this question in this section.
As a starting point, we may begin to understand what effective ice management looks like by plotting the tracks taken by seafarers during the 30 min simulation. If we trace all the tracks taken by seafarers in one plot, and then repeat this for cadets' tracks, we should begin to see spatial differences in maneuvers that may characterize underlying differences in tactics. The problem is that overlaying all the tracks directly produces very cluttered plots. One solution is to present “heatmaps” of the respective groups' tracks (Figs. 18 and 19). The heatmaps are constructed by dividing the simulation area into bins and counting the instances in which the own-ship passes through a given bin during simulations; the higher the aggregate count for a bin, the brighter its color. This is a clearer way of visualizing the two groups' aggregate tracks. Figures 18 and 19 display such tracks for the high-level (7-tenths) concentration cases of the emergency scenario for cadets and seafarers, respectively. Similar plots can be constructed for the low-level (4-tenths) concentration cases, although they are not included here.
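A minimal sketch of this binning-and-counting construction is given below, assuming each trial's track is available as an array of (x, y) position fixes; the variable names, bin count, and random-walk stand-in data are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Sketch of the track heatmap construction: bin the simulation area and count
# own-ship position fixes per bin, aggregated over all trials in a group.

def track_heatmap(tracks, bins=50):
    """2D histogram of own-ship positions aggregated over a list of (N, 2) track arrays."""
    xs = np.concatenate([track[:, 0] for track in tracks])
    ys = np.concatenate([track[:, 1] for track in tracks])
    counts, xedges, yedges = np.histogram2d(xs, ys, bins=bins)
    return counts, xedges, yedges

# Hypothetical example: random-walk tracks standing in for 18 recorded trials.
rng = np.random.default_rng(0)
tracks = [np.cumsum(rng.normal(size=(60, 2)), axis=0) for _ in range(18)]
counts, xe, ye = track_heatmap(tracks)
plt.imshow(counts.T, origin="lower", extent=[xe[0], xe[-1], ye[0], ye[-1]])
plt.colorbar(label="Position fixes per bin")
plt.xlabel("x")
plt.ylabel("y")
plt.show()
```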

Heatmap for cadets' tracks during emergency scenario (7-tenths concentration case). Note: The lifeboat launch zone is not drawn to scale.

Heatmap for seafarers' tracks during emergency scenario (7-tenths concentration case). Note: The lifeboat launch zone is not drawn to scale.
From Figs. 18 and 19, it is evident that spatial differences exist between the cadet group and the seafarer group. For one, the seafarers concentrate their positions in a single area (visible as the only bright patch in the heatmap in Fig. 19), whereas the cadets' positions are spread over two or three relatively large areas (Fig. 18). From this insight, the seafarers' chosen maneuvering tactics may be characterized as more focused and more uniformly executed. Because participants were actively discouraged from discussing the experiment with others, each trial was conducted independently. And yet, there appears to be a high degree of similarity among the seafarers' chosen tactics. Specifically, they appear to position themselves just upstream of the port lifeboat launch zone.
From the boxplots of cumulative ice-free lifeboat launch time (Fig. 16), there was considerable variability in seafarer performance. So, despite seafarers performing more effectively than cadets on average, a closer examination within the seafarer group is merited to determine what may distinguish one individual's effective trial from another individual's ineffective one. Figure 20 plots two tracks based on the midships position of the own-ship during the 30 min simulation scenario. The two tracks represent the best and the worst of all seafarers, where best and worst are measured by the highest and lowest cumulative ice-free lifeboat launch times, respectively. The resulting difference represents the largest single gap in performance between any two individuals for this scenario, including cadets. Clearly, positioning makes a difference, and the “best” track in this case demonstrates a highly effective tactic. The track shows a straightforward line heading toward the lifeboat launch zone, where the vessel stops upstream, swings about to create a lee down-drift of its port side, and holds position for the duration of the simulated scenario. The “worst” track, on the other hand, runs farther upstream of the FPSO and lifeboat launch zone, covering almost twice as much distance as the best track.

Plot of best and worst tracks for emergency scenario (7-tenths concentration case). Criterion is cumulative ice-free lifeboat launch time.
An inspection of the Replay files provides further clues as to what may distinguish successful tactics. Figures 21 and 22 show that heading with respect to the ice also plays a major role. For instance, the best trial (Fig. 21) shows that a “wedging” maneuver (so-called by the seafarer who produced it), whereby the vessel's quarter is positioned close to the FPSO and its heading is approximately 30 deg off the FPSO heading, effectively traps the ice between the two vessels. Ice accumulated and eventually drifted around the wedge created by the stand-by vessel, effectively clearing the area downstream that required attention. The worst trial (Fig. 22) appears to show an attempt to clear ice sideways, using the side thrusters to clear ice while maneuvering upstream. The issue with this appears to be that ice drifting into the bow of the FPSO was deflected along the length of the port side by the current. From here, the ice subsequently drifted into the lifeboat launch zone, unimpeded by the vessel's presence. Moreover, during the best trial (Fig. 21), use of the aft console allowed good visibility of the deck, which would have been an advantage while working close to the FPSO. For the worst trial, on the other hand, the stand-by vessel was oriented bow-on to the FPSO, and visibility over the bow would have been limited. Much of the ice between the bow and the FPSO would have been completely hidden from view. Inspection of the Replay files can therefore provide valuable insights into good practices in ice management to complement plots of tracks, which show a more general picture of overall tactics.

Midway mark (15 min) during emergency scenario for best trial, measured by cumulative ice-free lifeboat launch time (7-tenths concentration case)

Midway mark (15 min) during emergency scenario for worst trial, measured by cumulative ice-free lifeboat launch time (7-tenths concentration case)
Exit interviews were conducted for all participants and these may also offer important clues about the tactics. For example, the best track from Fig. 20 (depicted in the Replay images in Fig. 21) was performed by a seafarer code-named C79, who was asked during the after-action interview to reflect on the tactics undertaken during the simulation scenario. C79 was a master with approximately 30 years of experience at sea, which included more than ten seasons of operations in sea ice (including ice management), most recently within the past 3 years. Asked about the strategy employed in the simulation scenario, C79 stated:
I caught a large floe, which is advantageous to push against. My stern thrusters were at 100% [power allocation], pushing against trapped ice. I used side thrusters to maintain position. I tried to maintain a 30-degree heading using my thrusters. Ice travelling down would drift down and around [my bow]. I moved fast at the start to take advantage of clear water. Then I slowed in ice to less than 3 knots.
Alongside the Replay file imagery (Figs. 21 and 22) and the plot of tracks (Fig. 20), exit interview transcripts such as this one provide valuable qualitative information about ice management tactics. For example, we now know that a large ice floe was used strategically to block others, and that the ice trapped in the wedge was nearly overpowering the own-ship. Additionally, when asked about what factors might be important for success in such a scenario, C79 replied, “[One should] get setup instead of moving too much.” The track plot (Fig. 20), which showed a direct path to a location upstream of the lifeboat launch zone and minimal movement thereafter, corroborates this assertion.
In comparison, the “worst” track was performed by a master, code-named S41, who had accumulated 10 years of experience at sea. This included 3–10 seasons of operations in sea ice (including ice management), most recently within the past 3 years.
When asked about the tactics employed for the emergency scenario, S41 stated that they had been attempting to implement an industry procedure. Details about which procedure this referred to were not provided. When asked what changes they might make in a hypothetical repeat of the simulated scenario, S41 stated:
[In a repeat I would] come up closer to the bow of the FPSO. I would've cleaned out ice closer [to the FPSO], stern-first.
Interestingly, S41's remarks about positioning closer to the FPSO and maneuvering stern-first are both characteristics observed during C79's successful maneuver. This suggests that had S41 had the chance to repeat the trial, they would have applied a tactic similar to that of C79. Although learning effects were not directly measured in this experiment, this is a strong qualitative indicator that learning effects may exist for ice management simulator training.
Discussion
The experimental campaign reported here highlights the importance of appropriate simulation scenario design when assessing ice management effectiveness in a marine simulator. The precautionary scenario (Fig. 4) failed to detect any significant difference between the two experience groups, whereas the emergency scenario did. In the former, individuals were tasked with keeping the port and starboard lanes of the FPSO clear of ice as a precautionary measure so that lifeboats would be able to launch safely in the event of an emergency. Why did this scenario fail to detect differences between experience groups? The question may be best answered by the experienced seafarers who performed it. Transcripts of exit interviews taken shortly after simulation showed that, out of 18 seafarers, seven said that “clearing of both sides was not possible with a single vessel,” with five of these seven specifying that “two vessels are required for such a scenario.” Furthermore, two individuals stated that they had performed a similar precautionary ice management exercise “in real life,” and in those cases at least two stand-by vessels had been on-site to complete the job. It follows that the scenario would have been better suited to two or more stand-by vessels working together, rather than the single own-ship. In other words, the precautionary scenario was challenging by virtue of having a single vessel attempt a multivessel mission. This challenge outweighed the differences in performance that could be attributed to differences in officer experience alone.
There were some important limitations of the simulator that must be noted. Most importantly, these included the inherent difficulty of judging spatial distance from a two-dimensional screen. With no radar present, the officers had to rely on the very high frequency radio for information about distances to specified targets. The officers' watch-keeper (role-played by an experimenter at the Instructor Station) would report back distances measured with a built-in software tool, when asked. Another limitation of the simulator stemmed from the omission of transverse and astern velocities from the display screen in the bridge console; only total speed and heading were displayed. During ice management close to the FPSO, subtle shifts in speed in any direction were important for the officer to know about as he or she adjusted controls to maintain good position. Finally, visibility from the simulator bridge console was an issue. In a real ship, officers can walk to the bridge wing and look out along the side of the vessel to see ice. Its absence probably made ice management in the simulator more challenging, since ice was effectively invisible off the bow and sides of the own-ship. Improvements to the simulator technology would likely make the resulting data more reliable. Regardless, the simulator used in the experiment met the intended scope of the research: it was adequate to detect differences between bridge officer experience groups and to characterize general principles of good ice management practices.
So far, it has not been mentioned whether the seafarers had formal training in ice operations prior to completing the test trial. This information was collected in experience questionnaires, and it was found that out of the 18 experienced seafarers tested, only six reported having had formal training in ice management, with four of these describing their training as “Basic” and the remaining two describing it as “Advanced.” Despite this, 17 out of 18 reported having done ice management in the field. The lack of formal training in ice may explain the high degree of variability generally observed in performance within the seafarer group (Figs. 8 and 16).
It was also observed during exit interviews that five out of 12 seafarers operating offshore supply vessels and AHTS vessels were accustomed to using dynamic positioning (DP) for station keeping and maneuvering, which is the industry norm for vessels of these types. However, from the exit interviews it was clear that few, if any, officers would rely on DP for station keeping in drifting pack ice, apparently because of the nature of the loading that sea ice imposes on the hull. With this in mind, the experiments may be viewed as scenarios presenting conditions that required manual take-over of positioning controls. Bainbridge argues that the skills required for manual take-over from an autonomous control system like DP need special attention and training if they are to be exercised successfully [21]. In our experiments, only a third of operators had formal training in this area despite all but one having performed ice management in the field, potentially highlighting a gap in training for manual take-over of vessel positioning and station keeping in drifting pack ice.
Learning effects, which are observed when a scenario is repeated with improved results, are often of interest in simulator experiments. Learning effects were not directly measured in this experiment because repeat trials were not conducted; still, they were indirectly indicated by the interviews. Specifically, we compared each individual's measured performance in the emergency scenario to the corresponding performance score self-reported during the exit interview. The self-reported score was on a subjective scale of 1 to 5, where 1 was poor and 5 was excellent. For the measured performance, the cumulative ice-free lifeboat launch times for each trial were scaled linearly from 1 to 5, so that 1 equated to 0 min and 5 equated to approximately 18 min (the 95th percentile of recorded cumulative times). Note that because the 95th percentile was used, some “actual” scores were above 5. The results were plotted and a locally weighted scatterplot smoothing (LOESS) fit (weight parameter = 0.75) was used to explore trends (Fig. 23). The LOESS fit matched the unity line very closely, indicating that, on average, participants were strikingly accurate in perceiving the effectiveness of their own performance. A closer look, though, reveals the relationship is only moderately linear (Pearson's ρ = 0.56, p = 0.0003), indicating that this perception is accurate only on average over a relatively large group and that individual variation is quite high. Still, this finding indicates that, on average, individuals may accurately recognize their own needs for training after having completed a simulation scenario.
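The comparison just described can be reproduced in outline with the sketch below, which rescales measured launch times to the 1–5 scale, fits a LOESS curve, and computes the Pearson correlation. The example arrays are hypothetical placeholders, not the study's data.

```python
import numpy as np
from scipy import stats
from statsmodels.nonparametric.smoothers_lowess import lowess

# Sketch of the self-assessment comparison: rescale measured cumulative
# ice-free times so 0 min -> 1 and the 95th percentile -> 5, then compare
# against self-reported scores with a LOESS fit and Pearson correlation.

def scaled_measured_score(times_min):
    """Linear rescaling of measured times; values above the 95th percentile exceed 5."""
    times_min = np.asarray(times_min, dtype=float)
    p95 = np.percentile(times_min, 95)
    return 1 + 4 * times_min / p95

# Hypothetical placeholder data (self-reported score, measured minutes):
self_reported = np.array([2, 3, 3, 4, 5, 1, 4, 3])
times = np.array([4.0, 8.5, 7.0, 12.0, 17.5, 1.0, 13.0, 9.0])
measured = scaled_measured_score(times)

rho, p_value = stats.pearsonr(self_reported, measured)   # strength of linear association
smoothed = lowess(measured, self_reported, frac=0.75)    # LOESS fit, weight parameter 0.75
print(f"Pearson rho = {rho:.2f}, p = {p_value:.4f}")
print(smoothed[:3])                                      # first few (x, fitted) pairs
```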
Conclusion
Experienced crew have been shown to perform ice management more effectively than inexperienced crew in a simulator experiment. Based on the results, it can be concluded that the experience level of the bridge officers onboard a vessel tasked with an ice management operation strongly influences the resulting effectiveness. Effectiveness was measured in two ways: (i) average ice clearing in a defined area (tenths concentration) and (ii) cumulative ice-free lifeboat launch time (minutes). The latter metric applied to the emergency scenario only.
The hypothesis that the human factor of expertise influences effectiveness of ice management operations has been formally tested and accepted in this experimental campaign. Also, the hypothesis that the human factor of experience level has a larger effect when combined with higher concentration (positive interaction effect) has been formally tested and rejected. The experimental approach that was adopted represents a methodological contribution to the investigation of human factor issues in operational settings.
Results of the experimental campaign have also provided a basis for assessing operating tactics in ice management. This could offer insights for informing good practices in ice management as they apply to offshore operations. The gap between the two groups, as well as the variability within the respective groups, also provides a quantitative basis for the design of a training curriculum that could close the performance gap and reduce its variability.
Acknowledgment
The authors gratefully acknowledge the financial support provided by the Natural Sciences and Engineering Research Council of Canada (NSERC)/Husky Energy Industrial Research Chair in Safety at Sea and by the American Bureau of Shipping (ABS) Harsh Environment Technology Center. The first author acknowledges with gratitude the financial support provided by the Research and Development Corporation's Ocean Industry Student Research Award.